Abstract

Motivated by variational models in continuum mechanics, we introduce a novel 
algorithm for performing nonsmooth and nonconvex minimizations with linear con- 
straints. We show how this algorithm is actually a natural generalization of well-known 
non-stationary augmented Lagrangian methods for convex optimization. The relevant 
features of this approach are its applicability to a large variety of nonsmooth and non- 
convex objective functions, its guaranteed global convergence to critical points of the 
objective energy, and its simplicity of implementation. In fact, the algorithm results 
in a nested double loop iteration, where in the inner loop an augmented Lagrangian 
algorithm performs an adaptive finite number of iterations on a fixed quadratic and 
strictly convex perturbation of the objective energy, while the external loop performs 
an adaptation of the quadratic perturbation. To show the versatility of this new algo- 
rithm, we exemplify how it can be easily used for computing critical points in inverse 
free-discontinuity variational models, such as the Mumford-Shah functional, and, by 
doing so, we also derive and analyze new iterative thresholding algorithms. 

1 Introduction

Minimizers of integrals in calculus of variations typically possess singularities, which arise 
as the result of the nonsmoothness or nonconvexity of the energy. For certain problems in 
continuum mechanics, such singularities may represent physically interesting instabilities. 
Singular minimizers exist that model aspects, for instance, of solid phase transformations 
and certain modes of fracture. More generally, singular minimizers are
to be expected whenever the energy functionals to be minimized are nonconvex. Relevant
examples are given for instance by the most realistic models encountered in the literature 
of elasticity theory (see e.g., [22, 28, 4]). In this context, local minimizers play a pivotal 
role, as often evolution of physical phenomena proceeds along such energy critical points. 
Furthermore, usually the given problems have additional conditions, for instance boundary 
conditions, to be taken into account, which result in linear constraints to be satisfied by the 
(local) minimizer. Therefore the appropriate solution of constrained genuinely nonconvex 
problems is of the highest interest as well as the accurate numerical treatment of the 
singularities which are expected to characterize the minimizers. 

In the literature one can find efficient algorithmic solutions for linearly constrained 
convex and nonsmooth minimization, e.g., augmented Lagrangian methods [34, 23, 25], 
and for linearly constrained nonconvex minimization, such as sequential quadratic
programming (SQP) or (semi-smooth) Newton methods [33]. Unfortunately, in the latter
cases only smooth objective energies, usually at least $C^2$ functionals, can be addressed
by these algorithms, which are then guaranteed to converge only locally around the expected
critical point (local minimizer).

A typical case that does not fall into the class of problems treatable by the
algorithms mentioned above is the so-called Mumford-Shah functional [29], originally
introduced as a model of denoising and segmentation for digital images, and more generally
widely used as a schematization of many problems in mathematical physics where both
volume (bulk) and surface energies are present. One of the most popular methods to
address the numerical minimization of nonconvex functionals modelling the more general
class of free-discontinuity problems [1], including the Mumford-Shah functional, is by
$\Gamma$-approximation with functionals where the surface energy is more easily handled in terms
of a smooth indicator function of the discontinuity set of minimal solutions, as proposed by 
Ambrosio and Tortorelli in [3]. However, very recent adaptive numerical implementations 
of this regularized model for brittle fracture simulation [7] showed that such relaxation 
is highly unstable with respect to the approximation parameters, as it very likely creates 
several spurious local minimizers, resulting in unreliable simulations, unless the resolution 
of the adaptive finite element discretization is extremely fine. Hence, in this case, the 
interplay between smoothing and discretization tends to give nonphysical approximations 
to the problem. 

On the other hand, there are also efficient semi-heuristic methods, mainly designed for 
similar image processing models as the Mumford-Shah variational formulation, to seek 
for global minimizers of nonconvex energies. In these cases, global optimization can be 
addressed using either stochastic algorithms, such as the simulated annealing (SA) [26], or 
continuation-based deterministic relaxations, such as the graduated nonconvexity (GNC) 
pioneered by Blake and Zisserman [5], see also recent developments in [32, 30] and refer- 
ences therein. Being of some conceptual relevance for the scope of this paper, we mention 
how this latter technique works. For a suitable parameter $\varepsilon \in [0,1]$, one considers a
continuous family of objectives $J^{\varepsilon}$ such that $\lim_{\varepsilon \to 1} J^{\varepsilon} = J$ (at least pointwise), where $J$
is the nonconvex energy to be minimized. Then one addresses the global minimization
of $J$ by iterated local minimizations along $J^{\varepsilon}$ as $\varepsilon$ increases from 0 to 1, starting with a
strictly convex initial $J^{0}$. More formally, we consider an increasing sequence $(\varepsilon_n)_{n \in \mathbb{N}}$,
with $\varepsilon_0 = 0$ and $\lim_n \varepsilon_n = 1$, and the algorithm

$$v_{n+1} = \arg\min_{v \in \mathcal{N}_{\varepsilon_n}(v_n)} J^{\varepsilon_n}(v), \qquad (1.1)$$

where $\mathcal{N}_{\varepsilon_n}(v_n)$ is a suitable neighborhood of the previous iteration $v_n$ of size possibly
depending on $\varepsilon_n$. While such semi-heuristic algorithms perform very well in practice,
usually they do not provide any guarantee for global convergence.

Summarizing, both the interest in rigorous algorithmic procedures to find constrained 
critical points of nonconvex energies, and the relevant applications to free-discontinuity 
problems, lead us to the motivation of this paper. Its first goal is to propose a very gen- 
eral iterative algorithm to solve nonsmooth and nonconvex optimization problems with 
linear constraints. By nonsmoothness we mean that we require our objective function to
be in general only a locally Lipschitz function, contrary to the much more restrictive $C^2$
regularity usually required by most of the known methods. Moreover, as one of the most
relevant features of our iteration, we will show its global convergence, i.e., the initial state 
does not need to be in a neighborhood of a critical point. Our algorithm is in fact an 
appropriate combination of the above mentioned techniques, resulting in a nested double 
loop iteration, where in the inner loop an augmented Lagrangian algorithm performs an 
adaptive finite number of iterations on a fixed local quadratic perturbation of the objec- 
tive energy, while the external loop performs an adaptation of the quadratic perturbation. 
Conceptually our algorithm is reminiscent of the continuation-based deterministic relax- 
ation (1.1), but it will have stronger convergence guarantees, which provide also some 
rigorous justification to those semi-heuristic methods. 

In the second part of this paper, we demonstrate how to implement this algorithm for
addressing free-discontinuity problems, in particular the minimization of the Mumford- 
Shah functional. We show how to reformulate the Mumford-Shah minimization problem 
into a linearly constrained nonsmooth and nonconvex minimization involving truncated 
polynomial energy terms, for which the inner loop, i.e., the augmented Lagrangian feature 
of our algorithm, can be realized by means of an iterative thresholding algorithm. This
technique was first proposed in [21] to solve free-discontinuity problems in one
dimension. The extension we provide in this paper now allows us to successfully address
problems defined in any dimension, thanks to the appropriate handling of the
corresponding linear constraints.

Thresholding algorithms have by now a long history of successes, based on their extremely 
simple implementation, their statistical properties, and, in the iterative case, strong con- 
vergence guarantees. We retrace briefly some of the relevant developments, without the 
intention of providing an exhaustive mention of the many contributions in this area. The 
terminology "thresholding" comes from image and signal processing, however the asso- 
ciated mathematical concept is the Moreau proximity map [11], well-known from convex 
optimization. The statistical theory of thresholding was pioneered by Donoho and
Johnstone [16] in signal and image denoising and further and extensively explored in other 
work, e.g., [9]. Iterative soft-thresholding algorithms to numerically solve the minimization 
of convex energies, modelling inverse problems and formed by quadratic fidelity terms 
and $\ell_p$-norm penalties, for $p \geq 1$, were first proposed in [18]. Their strong
convergence was proven in the seminal work of Daubechies, Defrise, and De Mol [13].
The recent theory of compressed sensing, i.e., the universal and nonadaptive compressed 
acquisition of data [8, 15], stimulated also the research of iterative thresholding algorithms 
for nonconvex penalty terms, such as the $\ell_p$-quasinorms for $0 < p < 1$. Variational and
convergence properties of iterative firm-thresholding algorithms, in particular the iterative 
hard-thresholding, have been recently studied in [6, 20]. Partially inspired by these latter 
achievements and the work of Nikolova [31] on the relationships between certain threshold- 
ing operators and Mumford-Shah functionals, the results in [21] and in the present paper 
should be also considered as a contribution to the theory of thresholding algorithms in the 
context of linearly constrained nonsmooth and nonconvex optimization. 

While in this paper we limit ourselves to presenting the application to the minimization of
Mumford-Shah type functionals, mainly for their typical hard features of nonconvexity
and nonsmoothness, in our view the algorithm we study in this work will have significant 
further numerical applications in several problems involving nonsmooth and nonconvex 
energies with additional linear (boundary) conditions, as in fracture propagation [22], in 
elasto-plastic evolutions [28], and in atomic structure computation [4]. 

The paper is organized as follows. In Section 2 we define an appropriate concept of con- 
strained critical points for certain classes of nonconvex functionals. We introduce then our 
new algorithm for the solution of nonsmooth and nonconvex minimization with linear
constraints and we prove its convergence to critical points. Section 3 is devoted to showing
how the general algorithm can be used for the solution of inverse free-discontinuity prob- 
lems, starting with the reformulation of the classical discrete version of the Mumford-Shah 
model in terms of a nonsmooth and nonconvex optimization with linear constraints. In the 
concluding Section 4 we show how the core of the algorithm for inverse free-discontinuity 
problems can actually be realized as an iterative thresholding algorithm. Besides its
relevance in terms of simplicity of implementation, the insight provided by certain
segmentation properties of such iterations actually allows us to verify all the necessary conditions
for the algorithm to converge.

2 Linearly Constrained Nonsmooth and Nonconvex Minimization

2.1 Preliminaries and assumptions 

Let $\mathcal{H}$ be a finite dimensional Hilbert space and $J : \mathcal{H} \to \mathbb{R}$ a lower semicontinuous
functional which we assume to be bounded from below. Since we will be concerned with
the search of critical points, without any loss of generality we shall suppose from now on
that $J(v) \geq 0$, for all $v \in \mathcal{H}$. Let $\mathcal{H}_1$ be another finite dimensional Hilbert space, and
consider a linear operator $A : \mathcal{H} \to \mathcal{H}_1$. Both the spaces $\mathcal{H}$ and $\mathcal{H}_1$ are endowed
with a Hilbertian norm, which we will denote in both cases by $\|\cdot\|$, since it will
always be clear from the context in which space we are taking the norm. Dealing with finite
dimensional spaces, it remains understood that the only notion of convergence that we
will use is the strong convergence in norm, since weak and strong topologies are in this
case equivalent.

Concerning the operator $A$, we shall assume that $A$ has nontrivial kernel and is surjective.
We shall denote by $A^* : \mathcal{H}_1 \to \mathcal{H}$ the adjoint operator of $A$. By our assumptions,
there exists $\delta > 0$ such that, for every $w \in \mathcal{H}_1$,

$$\|A^* w\| \geq \delta \|w\|. \qquad (2.1)$$

We consider $f \in \mathcal{H}_1$ and we are concerned with the problem of finding constrained
critical points of $J$ on the affine space $\mathcal{F}(f) := \{v \in \mathcal{H} : Av = f\}$. As usual in nonsmooth
analysis, the notion of critical point is defined via the use of subdifferentiation.

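Definition 2.1. Let $\mathcal{H}$ be a Hilbert space, $J : \mathcal{H} \to \mathbb{R}$ a lower semicontinuous functional,
and $v \in \mathcal{H}$. We say that $\xi \in \mathcal{H}' \simeq \mathcal{H}$ belongs to the subdifferential $\partial J(v)$ of $J$ at $v$ if
and only if

$$\liminf_{w \to v} \frac{J(w) - \big(J(v) + \langle \xi, w - v \rangle\big)}{\|w - v\|} \geq 0. \qquad (2.2)$$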

The subdifferential can be in general empty or multivalued. It is well-known (see, for
instance [2, Chapter 1]) that it is a closed convex set. In the case where $J$ is convex, it
is nonempty at every point and it can be shown (see again [2, Proposition 1.4.4]) that

$$\xi \in \partial J(v) \text{ if and only if } J(w) - \big(J(v) + \langle \xi, w - v \rangle\big) \geq 0 \qquad (2.3)$$

for every $w \in \mathcal{H}$. In the case of a $C^1$ perturbation of a lower semicontinuous functional,
that is $J = J_1 + J_2$ where $J_1$ is lower semicontinuous, and $J_2$ is of class $C^1$, it follows
from the definition that if $\partial J_1(v)$ is nonempty, then $\partial J(v) \neq \emptyset$ and the decomposition

$$\partial J(v) = \partial J_1(v) + DJ_2(v) \qquad (2.4)$$
holds true. Here $DJ_2(v)$ denotes the Fréchet differential of $J_2$ at $v$. In particular,
$C^1$-perturbations of lower semicontinuous convex functionals have nonempty subdifferential
at every point. We collect in the following Remark some useful properties of the
subdifferential that will be employed in the sequel.
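
As a small numerical illustration of the convex characterization (2.3), the following sketch takes the scalar function $J(t) = |t|$ (our own choice of example, not taken from the text) and tests the inequality on a finite grid. The helper name `is_subgradient` and the grid are hypothetical, and a grid check can only suggest membership in $\partial J(v)$, it does not prove it.

```python
def is_subgradient(J, v, xi, grid, tol=1e-12):
    """Check J(w) - (J(v) + xi*(w - v)) >= 0 for all sampled w, cf. (2.3)."""
    return all(J(w) - (J(v) + xi * (w - v)) >= -tol for w in grid)

grid = [k / 100.0 for k in range(-300, 301)]  # sample points w in [-3, 3]

# At v = 0 the subdifferential of |t| is the whole interval [-1, 1].
inside = [is_subgradient(abs, 0.0, xi, grid) for xi in (-1.0, -0.3, 0.0, 0.7, 1.0)]
outside = [is_subgradient(abs, 0.0, xi, grid) for xi in (-1.5, 1.2)]

# At v = 2 the function is differentiable, so only xi = 1 passes the test.
at_two = [is_subgradient(abs, 2.0, xi, grid) for xi in (0.9, 1.0, 1.1)]
```

The example shows both features mentioned above: the subdifferential is multivalued at the kink and reduces to the derivative where $J$ is smooth.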

Remark 2.2. If $J$ is a $C^1$-perturbation of a convex function, one proves that the
subdifferential enjoys the following closure property:

$$\xi_n \in \partial J(v_n),\ v_n \to v,\ \xi_n \to \xi \quad \text{implies} \quad \xi \in \partial J(v) \text{ and } J(v_n) \to J(v). \qquad (2.5)$$

The subdifferential of a convex function $J$ is known to be a monotone operator [17],
that is, for every $v$ and $w \in \mathcal{H}$

$$\xi \in \partial J(v) \text{ and } \eta \in \partial J(w) \quad \text{implies} \quad \langle \xi - \eta, v - w \rangle \geq 0. \qquad (2.6)$$

We shall say that a function is $\nu$-strongly convex if a stronger form of (2.6) holds, that
is, there exists $\nu > 0$ such that

$$\xi \in \partial J(v) \text{ and } \eta \in \partial J(w) \quad \text{implies} \quad \langle \xi - \eta, v - w \rangle \geq \nu \|v - w\|^2. \qquad (2.7)$$

It is well-known that this is equivalent to saying that $J(\cdot) - \frac{\nu}{2}\|\cdot\|^2$ is convex.

We are now ready to recall the definition of critical point and constrained critical point. 

Definition 2.3. Let $\mathcal{H}$ be a Hilbert space, $J : \mathcal{H} \to \mathbb{R}$ a lower semicontinuous functional,
and $v \in \mathcal{H}$. We say that $v$ is a critical point of $J$ if either $J$ has no subdifferential at $v$
or

$$0 \in \partial J(v).$$

Clearly this last condition is the only one of the two to be meaningful whenever J 
has nonempty subdifferential at every point. When J is convex it is sufficient to assure 
global minimality of v, otherwise it is only a necessary condition for local minimality. 

Definition 2.4. Given a linear operator $A : \mathcal{H} \to \mathcal{H}_1$ with nontrivial kernel, and $f \in \mathcal{H}_1$,
we say that $w$ is a critical point of $J$ on the affine space $\mathcal{F}(f) = \{v \in \mathcal{H} : Av = f\}$ if
$Aw = f$ and $0$ is a critical point for the restriction to $\ker A$ of the functional $J(w + \cdot)$.

For $J$ being a $C^1$-perturbation of a convex function (in particular, with nonempty
subdifferential at every point), the nonsmooth version of the Lagrange multiplier theorem
assures that $w$ is a critical point of $J$ on the affine space $\{v \in \mathcal{H} : Av = f\}$ if and only
if $Aw = f$ and

$$\partial J(w) \cap \operatorname{ran}(A^*) \neq \emptyset, \qquad (2.8)$$

where $\operatorname{ran}(A^*)$ is the range of the operator $A^*$, which is known to be the orthogonal
complement of $\ker A$ in $\mathcal{H}$.

From now on, we make the following more specific assumptions on the function $J$:

(A1) $J$ is $\omega$-convex, that is, there exists $\omega > 0$ such that $J(\cdot) + \omega\|\cdot\|^2$ is convex;

(A2) the subdifferential of $J$ satisfies the following growth condition: there exist two
nonnegative numbers $K$, $L$ such that, for every $v \in \mathcal{H}$ and $\xi \in \partial J(v)$

$$\|\xi\| \leq K J(v) + L. \qquad (2.9)$$
Remark 2.5. (a) We observe that condition (A1) is in fact met, for instance, by any $C^1$
function with piecewise continuous and bounded second derivatives. We also recall that an
$\omega$-convex function is a $C^1$-perturbation of a convex function, therefore it has nonempty
(and locally bounded) subdifferential at every point. If the subdifferential is uniformly
bounded, then (2.9) is trivially satisfied.

(b) In finite dimension, by Rademacher's Theorem an $\omega$-convex function has a Fréchet
differential almost everywhere, and, if (2.9) is satisfied only at points of differentiability,
then it holds everywhere. This is true since it can be shown that the Fréchet subdifferential
is contained in the so-called Clarke subdifferential, which is known to be at every $v \in \mathcal{H}$ the
convex hull of limit points of differentials of $J$ along sequences $v_n \to v$ (for these notions,
see for instance [10, Chapter 2]). Therefore one does not need to calculate the subdifferential
of $J$ at non-differentiability points (which is in general quite a hard task) to check whether the
hypothesis is satisfied everywhere.
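
To make condition (A1) concrete, here is a minimal sketch on the scalar energy $J(t) = 1 + \cos t$, an example of our own choosing. Since $J'' \geq -1$, the perturbation $J(t) + \omega t^2$ is convex as soon as $\omega \geq 1/2$, and $|J'| \leq 1$ gives (2.9) with $K = 0$, $L = 1$. The code checks convexity through discrete second differences; the function names and the grid are hypothetical.

```python
import math

def second_differences(F, grid):
    """Central second differences F(t-h) - 2F(t) + F(t+h) on a uniform grid."""
    return [F(grid[i - 1]) - 2.0 * F(grid[i]) + F(grid[i + 1])
            for i in range(1, len(grid) - 1)]

def J(t):
    return 1.0 + math.cos(t)  # nonnegative, smooth, nonconvex

def perturbed(omega):
    """Return t -> J(t) + omega * t^2, the perturbation appearing in (A1)."""
    return lambda t: J(t) + omega * t * t

grid = [k * 0.01 for k in range(-1000, 1001)]  # t in [-10, 10]

min_plain = min(second_differences(J, grid))               # negative: J is nonconvex
min_small = min(second_differences(perturbed(0.25), grid))  # omega too small
min_one = min(second_differences(perturbed(1.0), grid))     # omega large enough
```

Nonnegative second differences on a grid are only a necessary symptom of convexity, so this is a sanity check rather than a proof.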

Given $\omega > 0$ and $u \in \mathcal{H}$, we will denote

$$J_{\omega,u}(v) := J(v) + \omega\|v - u\|^2. \qquad (2.10)$$

Notice that $J_{\omega,u}$ is coercive whenever $J$ is bounded from below. We observe that, if
$J$ satisfies (A1), we can always assume that $\omega$ is chosen in such a way that $J_{\omega,u}$ is also
$\nu$-strongly convex with $\nu$ depending on $J$ and $\omega$, but not on $u$. Analogously, if (A1)
and (A2) are satisfied, by using (2.4) it is easy to see that $J_{\omega,u}$ satisfies (2.9) with two
constants $K$, $L$ depending again on $J$ and $\omega$, but not on $u$.

2.2 The augmented Lagrangian algorithm in the convex case

We now recall some basic facts about augmented Lagrangian iterations for constrained
minimization of convex functionals. Here, we are given a coercive convex functional $J$
and, given two arbitrary $v_0 \in \mathcal{H}$, $q_0 \in \mathcal{H}_1$ such that $A^* q_0 \in \partial J(v_0)$, for every $k \in \mathbb{N}$,
$k \geq 1$, we define:

$$\begin{cases} v_k \in \arg\min_{v \in \mathcal{H}} \big( J(v) - \langle q_{k-1}, Av \rangle + \lambda \|Av - f\|^2 \big), \\ q_k = q_{k-1} + 2\lambda (f - A v_k). \end{cases} \qquad (2.11)$$

Convergence of the algorithm has been proved in [34], where it was called Bregman
iteration, and, since it is equivalent to the Augmented Lagrangian Method [25], also in [23].
Precisely, it has been shown that $\|Av_k - f\|$ decreases to $0$ as $k$ tends to $+\infty$, that the
sequence $(v_k)$ is compact, and that any limit point is a global minimum of $J$ under the constraint
$Av = f$. Moreover, for every $k$, $A^* q_k \in \partial J(v_k)$. When $J$ is $\nu$-strongly convex for some
$\nu > 0$ we have also a quantitative estimate of the convergence of $v_k$ to the unique (due
to strict convexity) minimizer of the problem. We give a precise statement and a proof of
this additional property, as it will be very useful later in the nonconvex case as well.

Proposition 2.6. Assume that $J$ is $\nu$-strongly convex, let $v_k$ and $q_k$ be the sequences
generated by (2.11), and let $\bar v$ be the unique global minimizer of $J$ on the affine space
$\{v \in \mathcal{H} : Av = f\}$. Then:

(i) $(\|Av_k - f\|)_{k \in \mathbb{N}}$ is a decreasing sequence;

(ii) $\lim_{k \to +\infty} \|Av_k - f\| = 0$;

(iii) $\|v_k - \bar v\|^2 \leq \frac{1}{\nu} \|q_0 - \bar q\| \, \|Av_k - f\|$, for all $k \in \mathbb{N}$,
for every $\bar q \in \mathcal{H}_1$ such that $A^* \bar q \in \partial J(\bar v)$.

Proof. Properties (i) and (ii) are proved in [34]. For property (iii), we first observe
that such a $\bar q$ surely exists by (2.8). We define $\Delta q_k := q_k - \bar q$ and we prove that $\|\Delta q_k\|$ is
decreasing. We actually have, by elementary computations and using (2.11), that

$$\|\Delta q_k\|^2 - \|\Delta q_{k-1}\|^2 \leq 2 \langle q_k - q_{k-1}, q_k - \bar q \rangle = 4\lambda \langle f - A v_k, q_k - \bar q \rangle = 4\lambda \langle \bar v - v_k, A^* q_k - A^* \bar q \rangle.$$

Since $A^* q_k \in \partial J(v_k)$ and $A^* \bar q \in \partial J(\bar v)$, the last term in the inequality is nonpositive by
(2.6), therefore the claim follows. In particular

$$\|q_k - \bar q\| \leq \|q_0 - \bar q\|. \qquad (2.12)$$

Now, by (2.7), we have also

$$\nu \|v_k - \bar v\|^2 \leq \langle A^* q_k - A^* \bar q, v_k - \bar v \rangle = \langle q_k - \bar q, A v_k - f \rangle,$$

so that we conclude by the Cauchy-Schwarz inequality and (2.12). □
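
The iteration (2.11) and the estimate of Proposition 2.6 (iii) can be tried out on a toy problem. The sketch below assumes $\mathcal{H} = \mathbb{R}^2$, $J(v) = \|v - b\|^2$ (so that (2.7) holds with $\nu = 2$) and a single constraint $\langle a, v \rangle = f$; these choices, the data, and the function names are ours and purely illustrative. For this quadratic $J$ the inner minimization has a closed form, which is not the case in general.

```python
def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def augmented_lagrangian(b, a, f, lam=1.0, iters=30):
    """Iteration (2.11) for J(v) = ||v - b||^2 and the constraint <a, v> = f.

    Returns the list of iterates v_k and the residuals |<a, v_k> - f|.
    """
    q = 0.0                      # q_0 = 0 with v_0 = b, so A* q_0 = grad J(v_0)
    s = dot(a, a)
    iterates, residuals = [], []
    for _ in range(iters):
        # Stationarity reads (I + lam * a a^T) v = b + (q/2 + lam*f) a,
        # solved in closed form with the Sherman-Morrison formula.
        c = 0.5 * q + lam * f
        w = [bi + c * ai for bi, ai in zip(b, a)]
        shrink = lam * dot(a, w) / (1.0 + lam * s)
        v = [wi - shrink * ai for wi, ai in zip(w, a)]
        q = q + 2.0 * lam * (f - dot(a, v))   # multiplier update
        iterates.append(v)
        residuals.append(abs(dot(a, v) - f))
    return iterates, residuals

b, a, f = [3.0, -1.0], [1.0, 1.0], 1.0
iterates, residuals = augmented_lagrangian(b, a, f)

# Exact constrained minimizer: orthogonal projection of b onto {<a, v> = f}.
t = (f - dot(a, b)) / dot(a, a)
v_bar = [bi + t * ai for bi, ai in zip(b, a)]
q_bar = 2.0 * t                  # A* q_bar = grad J(v_bar) = 2 (v_bar - b)
nu = 2.0                         # strong convexity constant of J in (2.7)
```

Comparing `sum((x - y)**2 ...)` between `iterates[k]` and `v_bar` with `abs(q_bar) * residuals[k] / nu` reproduces the bound (iii) for this example.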

When $J$ is the function $J_{\omega,u}$ defined by (2.10), with an appropriate choice of $\omega$, by
the previous result, (2.1), and (2.9), we get the following corollary, whose rather immediate
proof is therefore omitted.

Corollary 2.7. Consider the function $J_{\omega,u}$ defined by (2.10), where $\omega$ is chosen in such
a way that $J_{\omega,u}$ is $\nu$-strongly convex with $\nu$ not depending on $u$. Let $\bar v_u$ be the unique global
minimizer of $J_{\omega,u}$ on the affine space $\{v \in \mathcal{H} : Av = f\}$. Then there exist two positive
constants $C_1$ and $C_2$ depending on $A^*$, $J$, and $\omega$, but not on $u$, such that

$$\|v_{k,u} - \bar v_u\|^2 \leq \big[ C_1 (1 + \|q_0\|) + C_2 J_{\omega,u}(\bar v_u) \big] \|A v_{k,u} - f\|, \qquad (2.13)$$

where $v_{k,u} := v_k$ is defined according to (2.11) for $J = J_{\omega,u}$.

2.3 The algorithm in the nonconvex case 

We now present the new algorithm for linearly constrained nonsmooth and nonconvex
minimization, and discuss its convergence properties. We pick initial $v_{(0,0)} \in \mathcal{H}$ and
$q_{(0,0)} \in \mathcal{H}_1$. Notice that the initial iterate is not required to lie in any specific
neighborhood. For a fixed scaling parameter $\lambda > 0$, and an adaptively
chosen sequence of integers $(L_\ell)_{\ell \in \mathbb{N}}$, for every integer $\ell \geq 1$ we set (with the convention
$L_0 = 0$):

$$\begin{cases} v_{(\ell,0)} = v_{\ell-1} := v_{(\ell-1,L_{\ell-1})}, \qquad q_{(\ell,0)} = q_{\ell-1} := q_{(\ell-1,L_{\ell-1})}, \\ v_{(\ell,k)} = \arg\min_{v \in \mathcal{H}} \big( J_{\omega,v_{\ell-1}}(v) - \langle q_{(\ell,k-1)}, Av \rangle + \lambda \|Av - f\|^2 \big), \quad k = 1,\dots,L_\ell, \\ q_{(\ell,k)} = q_{(\ell,k-1)} + 2\lambda \big( f - A v_{(\ell,k)} \big). \end{cases} \qquad (2.14)$$

Here, thanks to condition (A1), $\omega$ is chosen in such a way that $J_{\omega,v_{\ell-1}}$ is $\nu$-strongly
convex, with $\nu$ independent of $\ell$, and the finite number of inner iterates $L_\ell$ is defined
by the condition

$$(1 + \|q_{\ell-1}\|) \, \|A v_{(\ell,L_\ell)} - f\| \leq \frac{1}{\ell^{\alpha}}, \qquad \alpha > 1. \qquad (2.15)$$

Since the inner loops are simply the augmented Lagrangian iterations for the functional
$J_{\omega,v_{\ell-1}}$, by Proposition 2.6 (ii) and (2.12) such an integer $L_\ell$ always exists. We also remark
that by construction, for every $\ell \geq 1$ and $k = 0, \dots, L_\ell$, with the only possible exception
of $q_{(\ell,0)}$, we have

$$A^* q_{(\ell,k)} \in \partial J_{\omega,v_{\ell-1}}(v_{(\ell,k)}). \qquad (2.16)$$

Moreover, for every $\ell \geq 1$, again by Proposition 2.6, $\|A v_{(\ell,k)} - f\|$ is nonincreasing in $k$.

Let us also remark that (2.14) is actually a natural generalization of (2.11): if $J$
were convex, we could in fact choose $\omega = 0$, and (2.14) would simply reduce to (2.11).
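
The nested structure of (2.14) with the stopping rule (2.15) can be sketched on a two-dimensional toy problem. Everything specific below is an assumption made for illustration: the energy $J(v) = 2 + \cos v_1 + \cos v_2$ (nonnegative, nonconvex, $\omega$-convex for $\omega \geq 1/2$), the constraint $v_1 + v_2 = 1$, the parameter values, and the use of plain gradient descent as a stand-in for the exact inner $\arg\min$.

```python
import math

def grad_J(v):
    """Gradient of the toy energy J(v) = 2 + cos(v_1) + cos(v_2)."""
    return [-math.sin(v[0]), -math.sin(v[1])]

def inner_argmin(u, q, a, f, omega, lam, v_start, steps=200):
    """Gradient descent on J_{omega,u}(v) - q*<a,v> + lam*(<a,v> - f)^2."""
    s = a[0] * a[0] + a[1] * a[1]
    step = 1.0 / (1.0 + 2.0 * omega + 2.0 * lam * s)   # 1 / Lipschitz bound
    v = list(v_start)
    for _ in range(steps):
        r = a[0] * v[0] + a[1] * v[1] - f
        g = grad_J(v)
        v = [v[i] - step * (g[i] + 2.0 * omega * (v[i] - u[i])
                            - q * a[i] + 2.0 * lam * r * a[i])
             for i in range(2)]
    return v

def nested_loop(v0, q0, a, f, omega=1.0, lam=1.0, alpha=2.0,
                outer=80, max_inner=1000):
    """Outer loop moves the quadratic perturbation, inner loop follows (2.14)."""
    v, q = list(v0), q0
    for ell in range(1, outer + 1):
        u, q_prev = v, q                    # v_{ell-1} and q_{ell-1}
        for _ in range(max_inner):          # inner augmented Lagrangian loop
            v = inner_argmin(u, q, a, f, omega, lam, v)
            res = f - (a[0] * v[0] + a[1] * v[1])
            q = q + 2.0 * lam * res         # multiplier update
            # stopping rule (2.15)
            if (1.0 + abs(q_prev)) * abs(res) <= 1.0 / ell ** alpha:
                break
    return v, q

v, q = nested_loop([2.0, 0.0], 0.0, [1.0, 1.0], 1.0)
```

For this constraint, $\operatorname{ran}(A^*)$ is spanned by $(1,1)$, so condition (2.8) amounts to $\sin v_1 = \sin v_2$ together with $v_1 + v_2 = 1$; both can be inspected on the returned `v`. With $\omega = 1$ each inner objective is strongly convex, which is what makes the fixed-step descent a reasonable substitute here.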



2.4 Analysis of convergence 

We now want to analyse the convergence properties of the algorithm defined by (2.14). 
To do that we will use the following basic calculus lemma. 

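Lemma 2.8. Let $(a_\ell)_{\ell \in \mathbb{N}}$ be a sequence of positive numbers, and let $(\delta_\ell)_{\ell \in \mathbb{N}}$ be a positive
decreasing sequence such that

$$\sum_{\ell=0}^{\infty} \delta_\ell < \infty.$$

If $a_\ell$ satisfies for every $\ell$ the inequality

$$a_\ell \leq (1 + \delta_{\ell-1}) a_{\ell-1} + \delta_{\ell-1}, \qquad (2.17)$$

then $(a_\ell)_{\ell \in \mathbb{N}}$ is a Cauchy sequence.

Proof. By the recurrence relation (2.17) we deduce

$$a_\ell \leq \Big[ \prod_{k=0}^{\ell-1} (1 + \delta_k) \Big] a_0 + \sum_{\ell'=0}^{\ell-1} \delta_{\ell'} \Big[ \prod_{k=\ell'+1}^{\ell-1} (1 + \delta_k) \Big]. \qquad (2.18)$$

Notice that

$$\log \Big[ \prod_{k=0}^{\infty} (1 + \delta_k) \Big] = \sum_{k=0}^{\infty} \log(1 + \delta_k) = \sum_{k=0}^{\infty} \frac{\delta_k}{\xi_k} \leq \sum_{k=0}^{\infty} \delta_k < \infty,$$

for suitable $\xi_k \in (1, 1 + \delta_k)$, for $k \in \mathbb{N}$, hence

$$\prod_{k=0}^{\infty} (1 + \delta_k) < \infty, \qquad (2.19)$$

and, together with (2.18), we deduce that $(a_\ell)_{\ell \in \mathbb{N}}$ is actually uniformly bounded. Now,
again by the recurrence relation (2.17), for $k' < k$, we obtain

$$(a_k - a_{k'}) = \sum_{\ell=k'+1}^{k} (a_\ell - a_{\ell-1}) \leq \sum_{\ell=k'+1}^{k} \delta_{\ell-1} a_{\ell-1} + \sum_{\ell=k'+1}^{k} \delta_{\ell-1}.$$

We conclude from the boundedness of $(a_\ell)_{\ell \in \mathbb{N}}$ and $\sum_{\ell=0}^{\infty} \delta_\ell < \infty$ that $(a_\ell)_{\ell \in \mathbb{N}}$ is also a Cauchy
sequence. □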
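
The boundedness argument in the proof can be checked numerically on the extreme case where (2.17) holds with equality. In the sketch below the choice $\delta_\ell = 1/(\ell+1)^2$, the starting value $a_0 = 1$, and the function name are hypothetical; the upper bound uses $\prod_k (1 + \delta_k) \leq \exp(\sum_k \delta_k)$ in (2.18).

```python
import math

def worst_case_sequence(a0, deltas):
    """Saturate a_l = (1 + d_{l-1}) a_{l-1} + d_{l-1}, the extreme case of (2.17)."""
    a = [a0]
    for d in deltas:
        a.append((1.0 + d) * a[-1] + d)
    return a

deltas = [1.0 / (l + 1) ** 2 for l in range(5000)]   # summable and decreasing
a = worst_case_sequence(1.0, deltas)

S = sum(deltas)
upper = math.exp(S) * (1.0 + S)      # bound from (2.18) with a_0 = 1
tail_gap = a[-1] - a[2500]           # how much the sequence still moves late on
```

Here `max(a)` stays below `upper`, and `tail_gap` is small, in line with the Cauchy property asserted by the lemma.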

In the following theorem we analyse the convergence properties of the proposed algo- 
rithm. 

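Theorem 2.9. Assume that $J$ satisfies (A1) and (A2), and let $(v_\ell)_{\ell \in \mathbb{N}}$ be the sequence
generated by (2.14). Then,

(a) $(A v_\ell - f) \to 0$ as $\ell \to \infty$;

(b) $(v_\ell - v_{\ell-1}) \to 0$ as $\ell \to \infty$.

If in addition $J$ is coercive, then $(v_\ell)_{\ell \in \mathbb{N}}$ is bounded and $(J(v_\ell))_{\ell \in \mathbb{N}}$ is a convergent sequence.
More generally, if $J$ only satisfies (A1) and (A2), the implication

$$(v_\ell)_{\ell \in \mathbb{N}} \text{ is a bounded sequence} \quad \text{implies} \quad (J(v_\ell))_{\ell \in \mathbb{N}} \text{ is convergent} \qquad (2.20)$$

holds.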

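Proof. Part (a) of the statement is a direct consequence of the construction of $v_\ell$ and
Proposition 2.6 (ii). We now set for every $\ell$

$$\hat v_\ell := \arg\min_{Av = f} J_{\omega,v_{\ell-1}}(v). \qquad (2.21)$$

By (2.13) with $u = v_{\ell-1}$, $k = L_\ell$, and $q_0 = q_{(\ell,0)}$, and by (2.14) and (2.15), we have that
there exist two positive constants $C_1$ and $C_2$ independent of $\ell$, such that

$$\|v_\ell - \hat v_\ell\|^2 \leq \big[ C_1 + C_2 J_{\omega,v_{\ell-1}}(\hat v_\ell) \big] \frac{1}{\ell^{\alpha}}. \qquad (2.22)$$

By this latter estimate and the minimality of $\hat v_{\ell+1}$ we get

$$J_{\omega,v_\ell}(\hat v_{\ell+1}) = J(\hat v_{\ell+1}) + \omega \|v_\ell - \hat v_{\ell+1}\|^2 \leq J(\hat v_\ell) + \omega \|v_\ell - \hat v_\ell\|^2 \leq J(\hat v_\ell) + \frac{C_1 \omega}{\ell^{\alpha}} + \frac{C_2 \omega}{\ell^{\alpha}} J_{\omega,v_{\ell-1}}(\hat v_\ell) \leq \frac{C_1 \omega}{\ell^{\alpha}} + \Big( 1 + \frac{C_2 \omega}{\ell^{\alpha}} \Big) J_{\omega,v_{\ell-1}}(\hat v_\ell). \qquad (2.23)$$

By Lemma 2.8 we eventually deduce that $(J_{\omega,v_{\ell-1}}(\hat v_\ell))_{\ell \in \mathbb{N}}$ is a Cauchy sequence, in
particular it is bounded. Therefore, there exists a constant $C$ independent of $\ell$ such that, by
(2.22),

$$\|v_{\ell+1} - \hat v_{\ell+1}\|^2 \leq \frac{C}{(\ell+1)^{\alpha}}, \qquad (2.24)$$

and, by (2.23), we have also

$$J(\hat v_{\ell+1}) \leq J(\hat v_{\ell+1}) + \omega \|v_\ell - \hat v_{\ell+1}\|^2 \leq J(\hat v_\ell) + \frac{\omega C}{\ell^{\alpha}}. \qquad (2.25)$$

Again Lemma 2.8 entails now that

$$J(\hat v_\ell) \text{ is a convergent sequence}, \qquad (2.26)$$

so that, by (2.25) we get that $(v_\ell - \hat v_{\ell+1}) \to 0$ as $\ell$ goes to $+\infty$, and this vanishing
convergence, combined with (2.24), gives part (b) of the statement.

Since $J$ is locally Lipschitz, being an $\omega$-convex function, if $(v_\ell)_{\ell \in \mathbb{N}}$ is uniformly bounded,
by (2.24) and (2.26) we immediately conclude that $(J(v_\ell))_{\ell \in \mathbb{N}}$ is a convergent sequence.
Moreover, if $J$ is coercive, then $(\hat v_\ell)_{\ell \in \mathbb{N}}$ is bounded by (2.26), and so is also $(v_\ell)_{\ell \in \mathbb{N}}$ by (2.24),
as required. □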

As a consequence we get the main result of this section. Whenever $(v_\ell)_{\ell \in \mathbb{N}}$ is bounded, every
cluster point is a constrained critical point of $J$ on the affine space $\{v \in \mathcal{H} : Av = f\}$.
We again recall that boundedness of $(v_\ell)_{\ell \in \mathbb{N}}$ is guaranteed by Theorem 2.9 when $J$ is assumed
to be coercive.

Theorem 2.10. Assume that $J$ satisfies (A1) and (A2), and let $(v_\ell)_{\ell \in \mathbb{N}}$ be the sequence
generated by (2.14). If $(v_\ell)_{\ell \in \mathbb{N}}$ is bounded, each of its limit points is a constrained critical
point of $J$ on the affine space $\{v \in \mathcal{H} : Av = f\}$.

Proof. Let $(q_\ell)_{\ell \in \mathbb{N}}$ be the sequence defined by (2.14), and let $p_\ell := A^* q_\ell$, and $\tilde p_\ell :=
p_\ell - 2\omega (v_\ell - v_{\ell-1})$. By (2.4) and (2.16), we have

$$\tilde p_\ell \in \partial J(v_\ell), \qquad (2.27)$$

and by the boundedness of $(v_\ell)_{\ell \in \mathbb{N}}$ and the local Lipschitz continuity of $J$ we then get
that $\tilde p_\ell$ is bounded too. By Theorem 2.9, part (b), we deduce that $p_\ell - \tilde p_\ell \to 0$, which in
particular gives

$$\lim_{\ell \to \infty} \operatorname{dist}(\tilde p_\ell, \operatorname{ran}(A^*)) = 0. \qquad (2.28)$$

Now, if a subsequence $v_{\ell_j} \to v \in \mathcal{H}$, possibly taking a further subsequence we may assume
that $\tilde p_{\ell_j} \to p \in \partial J(v)$, where the last inclusion follows from (2.5) and (2.27). Moreover,
since in finite dimension $\operatorname{ran}(A^*)$ is closed, by (2.28) $p \in \operatorname{ran}(A^*)$. Since $Av = f$ by part
(a) of Theorem 2.9, (2.8) yields now the desired conclusion. □

3 The Application to Free-Discontinuity Problems 

3.1 The Mumford-Shah problem and its reformulation as a finite di- 
mensional linearly constrained nonconvex minimization 

The terminology 'free-discontinuity problem' was introduced by De Giorgi [14] to indicate 
a class of variational problems that consist in the minimization of a functional, involving 
both volume and surface energies, depending on a closed set $K \subset \mathbb{R}^d$, and a function $u$
on $\mathbb{R}^d$ usually smooth outside of $K$. In particular,

• $K$ is not fixed a priori and is an unknown of the problem;

• K is not a boundary in general, but a free-surface inside the domain of the problem. 

The best-known example of a free-discontinuity problem is the one modelled by the
so-called Mumford-Shah functional [29], which is defined by

$$J(u,K) := \int_{\Omega \setminus K} \big[ |\nabla u|^2 + \alpha (u - g)^2 \big] \, dx + \beta \mathcal{H}^{d-1}(K \cap \Omega).$$

The set $\Omega$ is a bounded open subset of $\mathbb{R}^d$, $\alpha, \beta > 0$ are fixed constants, and $g \in L^\infty(\Omega)$.
Here $\mathcal{H}^N$ denotes the $N$-dimensional Hausdorff measure. Inspired by image processing
applications, throughout the rest of this paper, the dimension of the underlying Euclidean
space $\mathbb{R}^d$ will always be $d = 2$, although in principle the analysis can be conducted in any
dimension. In fact, in the context of visual analysis, $g$ is a given noisy image that we want
to approximate by the minimizing function $u \in W^{1,2}(\Omega \setminus K)$; the set $K$ is simultaneously
used in order to segment the image into connected components. For a broad overview on
free-discontinuity problems, their analysis, and applications, we refer the reader to [1].

In fact, the Mumford-Shah functional is the continuous version of a previous discrete
formulation of the image segmentation problem proposed by Geman and Geman in [24];
see also the work of Blake and Zisserman in [5]. Let us recall this discrete approach. Let
$d = 2$ (as for image processing problems), $\Omega = [0,1]^2$, and let $u_{i,j} = u(hi, hj)$, $(i,j) \in \mathbb{Z}^2$,
be a discrete function defined on $\Omega_h := \Omega \cap h\mathbb{Z}^2$, for $h > 0$. Define $W_r^2(t) := \min\{t^2, r^2\}$,
$r > 0$, to be the truncated quadratic potential, and

$$J_h(u) := h^2 \sum_{(hi,hj) \in \Omega_h} W_{\sqrt{\beta/h}}^2 \Big( \frac{u_{i+1,j} - u_{i,j}}{h} \Big) + h^2 \sum_{(hi,hj) \in \Omega_h} W_{\sqrt{\beta/h}}^2 \Big( \frac{u_{i,j+1} - u_{i,j}}{h} \Big) + \alpha h^2 \sum_{(hi,hj) \in \Omega_h} (u_{i,j} - g_{i,j})^2. \qquad (3.1)$$

We shall now reformulate the minimization of this finite dimensional discrete problem into
a linearly constrained minimization of a nonconvex functional of the discrete derivatives.
For this purpose, we consider the derivative matrix $D_h : \mathbb{R}^{n^2} \to \mathbb{R}^{2n(n-1)}$ that maps the
vector $(u_{j+(i-1)n}) := (u_{i,j})$ to the vector composed of the finite differences in the horizontal
and vertical directions $u_x$ and $u_y$ respectively, given by

$$D_h u := \begin{bmatrix} u_x \\ u_y \end{bmatrix}, \qquad \begin{cases} (u_x)_{j+n(i-1)} := (u_x)_{i,j} := \dfrac{u_{i+1,j} - u_{i,j}}{h}, & i = 1,\dots,n-1,\ j = 1,\dots,n, \\[2mm] (u_y)_{j+(n-1)(i-1)} := (u_y)_{i,j} := \dfrac{u_{i,j+1} - u_{i,j}}{h}, & i = 1,\dots,n,\ j = 1,\dots,n-1. \end{cases}$$


Note that its range $\operatorname{ran}(D_h) \subset \mathbb{R}^{2n(n-1)}$ is an $(n^2 - 1)$-dimensional subspace because
$D_h c = 0$ for constant vectors $c \in \mathbb{R}^{n^2}$. It is not difficult to show the representation of any
vector $u \in \mathbb{R}^{n^2}$ in terms of the following differentiation-integration formula, given by

$$u = D_h^\dagger D_h u + c,$$

where $D_h^\dagger$ is the pseudo-inverse matrix of $D_h$ (in the Moore-Penrose sense); note that $D_h^\dagger$
maps $\operatorname{ran}(D_h)$ injectively into $\mathbb{R}^{n^2}$. Also, $c$ is a constant vector that depends on $u$, and
the values of its entries coincide with the mean value $h^2 \sum_{(hi,hj) \in \Omega_h} u_{i,j}$ of $u$. Therefore,
any vector $u$ is uniquely identified by the pair $(D_h u, c)$.

Since constant vectors comprise the null space of $D_h$, the orthogonality relation

$$\langle D_h^\dagger D_h u, c \rangle = 0 \qquad (3.2)$$

holds for any vector $u$ and any constant vector $c$. Here the scalar product $\langle u, u' \rangle =
\sum_{(hi,hj) \in \Omega_h} u_{i,j} u'_{i,j}$ is the standard Euclidean scalar product on $\mathbb{R}^{n^2}$, which induces the
Euclidean norm $\|\cdot\|$.

Using the orthogonality property (3.2), we have that

$$\|u - g\|^2 = \|D_h^\dagger D_h u - D_h^\dagger D_h g + (c - c_g)\|^2 = \|D_h^\dagger D_h u - D_h^\dagger D_h g\|^2 + \|c - c_g\|^2.$$

Hence, with a slight abuse of notation, we can reformulate the original discrete functional
(3.1) in terms of derivatives, and mean values, by

$$J_h(v,c) = h^2 \Big[ \alpha \|D_h^\dagger v - g\|^2 + \alpha \|c - c_g\|^2 + \sum_{i,j} \min\Big\{ |v_{i,j}|^2, \frac{\beta}{h} \Big\} \Big],$$

where $v = D_h u \in \mathbb{R}^{2n(n-1)}$, and $g = D_h^\dagger D_h g \in \mathbb{R}^{n^2}$. Of course $c = c_g$ is again assumed
at the minimizer $u$, since this latter term in $J_h$ does not depend on $v$. However, in order
to minimize only over vectors in $\mathbb{R}^{2n(n-1)}$ that are derivatives of vectors in $\mathbb{R}^{n^2}$, we must
minimize $J_h(v,c)$ subject to the constraint $(D_h D_h^\dagger - I)v = 0$. These $2n(n-1)$ linear
constraints (of which only $(n-1)^2$ are linearly independent, since $I - D_h D_h^\dagger$ is the
orthogonal projector onto the orthogonal complement of $\operatorname{ran}(D_h)$) actually correspond to a
discrete curl-free condition on the vector $v$.
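
The discrete curl-free condition can be made explicit with a short sketch. It assumes the grid function is stored as a nested list `u[i][j]` and works with the difference fields directly rather than with the matrix $D_h$; the function names and the test data are hypothetical. The circulation of $(u_x, u_y)$ around each of the $(n-1)^2$ grid cells vanishes for a discrete gradient, and $(n-1)^2 = 2n(n-1) - (n^2 - 1)$ is exactly the codimension of $\operatorname{ran}(D_h)$.

```python
def finite_differences(u, h):
    """Return (u_x, u_y) for an n-by-n grid function u[i][j], as in D_h u."""
    n = len(u)
    ux = [[(u[i + 1][j] - u[i][j]) / h for j in range(n)] for i in range(n - 1)]
    uy = [[(u[i][j + 1] - u[i][j]) / h for j in range(n - 1)] for i in range(n)]
    return ux, uy

def discrete_curl(ux, uy, h):
    """Circulation of (ux, uy) around each of the (n-1)^2 grid cells."""
    n = len(uy)
    return [[(ux[i][j] + uy[i + 1][j] - ux[i][j + 1] - uy[i][j]) * h
             for j in range(n - 1)] for i in range(n - 1)]

n, h = 4, 0.25
u = [[(i * i - 3 * j) * 0.5 + i * j for j in range(n)] for i in range(n)]
ux, uy = finite_differences(u, h)
curl = discrete_curl(ux, uy, h)          # vanishes: (ux, uy) is a gradient

const = [[7.0] * n for _ in range(n)]
cx, cy = finite_differences(const, h)    # constants lie in the kernel of D_h
```

A field that is not a discrete gradient, for instance `ux` identically zero and `uy[i][j] = i`, produces a nonzero circulation and therefore violates the constraint.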

To summarize, we arrive at the following constrained optimization problem: 
' Minimize J h (y) = h 2 [a\\Tv - g\\ 2 + J^i j mm {\zi,j\ 2 , f } ] ■ 

(3.3) 

subject to Az = 0, 

for T = D{ and A = I - D h D ] h . O nee the minimal derivative vector v is computed, we 
can assemble the minimal u by incorporating the mean value of g as follows: 

u = D[v + Cg. 
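
As a numerical sanity check of the reformulation above, the following sketch builds $D_h$ on a small grid with NumPy and verifies the differentiation-integration formula, the orthogonality relation (3.2), and the constraint $Av = 0$ for $v = D_h u$. The function name `discrete_gradient`, the row ordering (x-differences first, then y-differences), the row-major flattening of $u$, and the grid size are illustrative assumptions, not part of the original text.

```python
import numpy as np

def discrete_gradient(n, h):
    """Forward-difference matrix D_h of shape (2n(n-1), n^2).

    Assumptions of this sketch: u[i, j] is stored at flat index i * n + j,
    and the x-differences are stacked before the y-differences.
    """
    rows = []
    for i in range(n - 1):              # (u_x)_{i,j} = (u_{i+1,j} - u_{i,j}) / h
        for j in range(n):
            row = np.zeros(n * n)
            row[(i + 1) * n + j] = 1.0 / h
            row[i * n + j] = -1.0 / h
            rows.append(row)
    for i in range(n):                  # (u_y)_{i,j} = (u_{i,j+1} - u_{i,j}) / h
        for j in range(n - 1):
            row = np.zeros(n * n)
            row[i * n + j + 1] = 1.0 / h
            row[i * n + j] = -1.0 / h
            rows.append(row)
    return np.array(rows)

n, h = 4, 0.25
D = discrete_gradient(n, h)
D_pinv = np.linalg.pinv(D, 1e-10)        # Moore-Penrose pseudo-inverse, explicit cutoff
A = np.eye(D.shape[0]) - D @ D_pinv      # constraint matrix A = I - D_h D_h^+

rng = np.random.default_rng(0)
u = rng.standard_normal(n * n)
c = np.full(n * n, u.mean())             # constant vector carrying the mean value of u
v = D @ u                                # derivative vector

u_rebuilt = D_pinv @ v + c               # u = D_h^+ D_h u + c
```

With $h = 1/n$ the mean value $h^2 \sum u_{ij}$ is the arithmetic mean used for `c` here.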

3.2 Truncated polynomial minimization 

We define the truncated polynomial scalar function $W_r^p(t) = \min\{|t|^p, r^p\}$, for $r > 0$, $p \ge 1$, and any $t \in \mathbb{R}$. Returning to our more abstract setting of the first part of this paper, we consider again two finite dimensional Hilbert spaces $\mathcal{H}$ and $\mathcal{H}_1$ and a surjective linear constraint map $A : \mathcal{H} \to \mathcal{H}_1$. In addition we consider another finite dimensional Hilbert space $\mathcal{K}$ and a linear operator $T : \mathcal{H} \to \mathcal{K}$. Again, to ease the notation, we indicate with $\|\cdot\|$ the Hilbertian norms on $\mathcal{H}$, $\mathcal{H}_1$, or $\mathcal{K}$ indifferently, as they can be inferred from the context where they are applied. For fixed $g \in \mathcal{K}$ and $f \in \mathcal{H}_1$, inspired by the reformulation (3.3) of the Mumford-Shah functional as a truncated quadratic minimization of the derivative vector, in the following we consider the more general functionals of the type

$$J_p(v) = \|Tv - g\|^2 + \gamma \sum_{i=1}^m W_r^p(v_i), \tag{3.4}$$

to be minimized subject to a linear constraint $Av = f$, where $\gamma > 0$ is a positive regularization parameter. Here $(v_i)_{i=1}^m$ are the components of the vector $v$ with respect to a fixed basis in the space $\mathcal{H}$ of dimension $m = \dim(\mathcal{H})$.
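
For concreteness, the functional (3.4) can be evaluated with a few lines of NumPy. This is only a sketch: the function names `W` and `J_p` and the toy data are hypothetical choices, and $T$ is represented as a dense real matrix.

```python
import numpy as np

def W(t, r, p):
    """Truncated polynomial W_r^p(t) = min(|t|^p, r^p), applied entrywise."""
    return np.minimum(np.abs(t) ** p, r ** p)

def J_p(v, T, g, gamma, r, p):
    """Functional (3.4): ||T v - g||^2 + gamma * sum_i W_r^p(v_i)."""
    residual = T @ v - g
    return float(residual @ residual + gamma * np.sum(W(v, r, p)))

# Toy example (all values hypothetical): T = identity, g = 0.
v = np.array([0.5, -3.0])
value = J_p(v, np.eye(2), np.zeros(2), gamma=2.0, r=1.0, p=2)
```

The second component is truncated at $r^p = 1$, so its contribution to the penalty no longer grows with $|v_i|$, which is the source of both the nonconvexity and the lack of coercivity discussed in the text.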

First of all, we should mention that, independently of the choice of the linear operators $T$ and $A$, by [21, Theorem 2.3], the constrained minimization problem

$$\text{Minimize } J_p(v) = \|Tv - g\|^2 + \gamma \sum_{i=1}^m W_r^p(v_i), \quad \text{subject to } Av = f, \tag{3.5}$$

always has minimizers. Notice that this is a remarkable result, as the problem is in general not coercive. Unfortunately, uniqueness of solutions cannot be guaranteed.

Remark 3.1. The proof of existence of solutions of (3.5) is based on a special orthogonal decomposition of certain convex sets, see [21, Appendix, Section 8.1]. Let us report the main fact, which will turn out to be useful to us again later in this paper.
Define $\mathcal{J}_p(v) = \|Tv - g\|^2 + \gamma \sum_{i=1}^m c_i |v_i|^p$ for scalars $c_1, \dots, c_m$; notice that we allow some of them to be negative or zero, as long as $\mathcal{J}_p(v) \ge C_{\inf} > -\infty$ for all $v \in \mathcal{H}$. Then for any constant $C > 0$ and any polyhedral convex set $X \subset \mathcal{H}$, there exists a linear subspace $V = V_{X,C} \subset \mathcal{H}$, such that the orthogonal projection $X^\perp$ of $X$ onto $V^\perp$ has the properties

• $X = \{x = x^\perp \oplus tv : x^\perp \in X^\perp, v \in V, t \in \mathbb{R}_+\}$,

• $M_C = X^\perp \cap \{v \in \mathcal{H} : \mathcal{J}_p(v) \le C\}$ is compact, and

• $\mathcal{J}_p(\xi)$ is constant along rays $\xi = x^\perp \oplus tv$, where $x^\perp \in M_C$, $v \in V$, and $t \in \mathbb{R}_+$.

For $I_0 \subset I$ and $U_{I_0} := \{v \in \mathcal{H} : |v_i| \le r, i \in I_0 \text{ and } |v_i| > r, i \in I \setminus I_0\}$, in particular this result applies on $X = \mathcal{F}(f) \cap \overline{U}_{I_0}$, hence

$$\text{Minimize } \mathcal{J}_p(v) = \|Tv - g\|^2 + \gamma \sum_{i=1}^m c_i |v_i|^p, \quad \text{subject to } Av = f \text{ and } v \in \overline{U}_{I_0}, \tag{3.6}$$

has solutions, actually in the compact set $M_{\mathcal{J}_p(v^0)} = X^\perp \cap \{v \in \mathcal{H} : \mathcal{J}_p(v) \le \mathcal{J}_p(v^0)\}$, for any $v^0 \in \mathcal{H}$.
3.3 Unconstrained minimization and iterative thresholding

Despite the nonsmoothness and the nonconvexity of the functional $J_p$, in the work [21] a very simple and globally converging iterative algorithm has been studied in order to find local minimizers of $J_p$ in the case of the unconstrained minimization, i.e., when we omit the linear constraint $Av = f$. This case, for instance, solves the minimization of the Mumford-Shah functional in one dimension, see [21] for details. The method is actually a forward-backward or majorization-minimization algorithm of Douglas-Rachford type [27] for finding minimal solutions to $J_p$. More precisely, consider the following surrogate objective function,

$$J_p^{\mathrm{surr}}(v,a) := J_p(v) - \|Tv - Ta\|^2 + \|v - a\|^2, \quad v, a \in \mathcal{H}. \tag{3.7}$$

As $\|T\| < 1$ can be assumed without loss of generality, possibly rescaling $\gamma$ and $g$ (see [21, Section 3]), the surrogate functional $J_p^{\mathrm{surr}}$ satisfies $J_p^{\mathrm{surr}}(v,a) \ge J_p(v)$ everywhere, with equality if and only if $v = a$, and is such that the sequence

$$v^{n+1} = \arg\min_{v \in \mathcal{H}} J_p^{\mathrm{surr}}(v, v^n) \tag{3.8}$$

obtained by successive minimizations of $J_p^{\mathrm{surr}}(v,a)$ in $v$ for fixed $a$ results in a nonincreasing sequence of the original functional $(J_p(v^n))_{n \in \mathbb{N}}$ (see [21, Lemmas 4.1 and 4.2]). Moreover, expanding the squared terms on the right hand side of the expression for $J_p^{\mathrm{surr}}$, we have

$$\begin{aligned} J_p^{\mathrm{surr}}(v,a) &= \|v - (I - T^*T)a - T^*g\|^2 + \gamma \sum_{i=1}^m \min\{|v_i|^p, r^p\} + C \\ &= \sum_{i=1}^m \Big[ (v_i - [a - T^*Ta + T^*g]_i)^2 + \gamma \min\{|v_i|^p, r^p\} \Big] + C, \end{aligned}$$

where the term $C = C(T,a,g)$ depends only on $T$, $a$ and $g$. It is now clear that the surrogate functional $J_p^{\mathrm{surr}}$ decouples in the variables $v_i$, due to the cancellation of terms involving $\|Tv\|^2$. Because of this decoupling, global $v$-minimizers of $J_p^{\mathrm{surr}}(v,a)$, for $a$ fixed, can be computed component-wise according to

$$v_i = \arg\min_{t \in \mathbb{R}} \Big[ (t - [a - T^*Ta + T^*g]_i)^2 + \gamma W_r^p(t) \Big], \quad i = 1, \dots, m. \tag{3.9}$$
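
The majorization property and the decoupling can be checked numerically. The sketch below is not taken from [21]; the helper names `J_surr` and `decoupled` and the random test data are assumptions made for illustration, with $T$ a real matrix rescaled so that $\|T\| = 0.9$.

```python
import numpy as np

def J_p(v, T, g, gamma, r, p):
    res = T @ v - g
    return float(res @ res + gamma * np.sum(np.minimum(np.abs(v) ** p, r ** p)))

def J_surr(v, a, T, g, gamma, r, p):
    """Surrogate (3.7): J_p(v) - ||T(v - a)||^2 + ||v - a||^2."""
    d = v - a
    Td = T @ d
    return J_p(v, T, g, gamma, r, p) - float(Td @ Td) + float(d @ d)

def decoupled(v, a, T, g, gamma, r, p):
    """Separable part of the surrogate: sum_i (v_i - b_i)^2 + gamma * W_r^p(v_i)."""
    b = a - T.T @ (T @ a) + T.T @ g
    return float(np.sum((v - b) ** 2 + gamma * np.minimum(np.abs(v) ** p, r ** p)))

rng = np.random.default_rng(1)
T = rng.standard_normal((3, 4))
T *= 0.9 / np.linalg.norm(T, 2)          # rescale so that ||T|| = 0.9 < 1
g, a = rng.standard_normal(3), rng.standard_normal(4)
v1, v2 = rng.standard_normal(4), rng.standard_normal(4)
params = dict(T=T, g=g, gamma=1.0, r=1.0, p=2)

# surrogate - decoupled should equal the constant C(T, a, g), whatever v is.
C1 = J_surr(v1, a, **params) - decoupled(v1, a, **params)
C2 = J_surr(v2, a, **params) - decoupled(v2, a, **params)
```

Since `C1` and `C2` agree, minimizing the surrogate in $v$ is the same as minimizing the separable expression, one coordinate at a time.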

The advantage of this strategy is that one can solve (3.9) explicitly when, e.g., $p = 2$, $p = 3/2$, and $p = 1$; in the general case $p \ge 1$, we have the following result:

Proposition 3.2 ([21, Proposition 4.3]).

1. If $p > 1$, the minimization problem $v = \arg\min_{v \in \mathcal{H}} J_p^{\mathrm{surr}}(v,a)$ can be solved component-wise as in (3.9) by

$$v_i = H_p([a - T^*Ta + T^*g]_i), \quad i = 1, \dots, m, \tag{3.10}$$

where $H_p : \mathbb{R} \to \mathbb{R}$ is the 'thresholding function',

$$H_p(\xi) = \begin{cases} F_p^{-1}(\xi), & |\xi| \le \xi', \\ \xi, & |\xi| > \xi'. \end{cases}$$

Here, $F_p^{-1}$ is the inverse of the function $F_p(t) = t + \frac{\gamma p}{2} \operatorname{sign}(t) |t|^{p-1}$, and $\xi' \in (r, r + \frac{\gamma p}{2} r^{p-1})$ is the unique positive value at which

$$(F_p^{-1}(\xi') - \xi')^2 + \gamma |F_p^{-1}(\xi')|^p = \gamma r^p. \tag{3.11}$$

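2. When $p = 1$, the general form (3.10) still holds, but we have to consider two cases:

(a) If $r > \gamma/4$, the thresholding function $H_1 : \mathbb{R} \to \mathbb{R}$ satisfies

$$H_1(\xi) = \begin{cases} 0, & |\xi| \le \gamma/2, \\ (|\xi| - \gamma/2) \operatorname{sign} \xi, & \gamma/2 < |\xi| \le r + \gamma/4 = \xi', \\ \xi, & |\xi| > r + \gamma/4. \end{cases} \tag{3.12}$$

(b) If, on the other hand, $r \le \gamma/4$, the function $H_1$ satisfies

$$H_1(\xi) = \begin{cases} 0, & |\xi| \le \sqrt{\gamma r}, \\ \xi, & |\xi| > \sqrt{\gamma r}. \end{cases} \tag{3.13}$$

In all cases, the thresholding function is continuous except at $\xi'(r,\gamma,p)$, where it has a jump-discontinuity of size $\delta = |\xi' - H_p(\xi')| > 0$ if $r, \gamma > 0$. In particular, $\xi' > r$ while $H_p(\xi') < r$.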

In the previous proposition, we used the notation $H_p$ neglecting the parameters $r$ and $\gamma$; however, $H_p = H_{r,\gamma,p}$ does depend on them as well, and it is characterized by

$$H_{r,\gamma,p}(\xi) = \arg\min_{t \in \mathbb{R}} \, (t - \xi)^2 + \gamma W_r^p(t). \tag{3.14}$$

To summarize, the iterative algorithm (3.8) can be recast in terms of a component-wise iterative thresholding algorithm,

$$v_i^{n+1} = H_p([v^n - T^*Tv^n + T^*g]_i), \tag{3.15}$$

which, for the parameter $p = 2$ of the truncated quadratic functional as in (3.3), reduces simply to

$$v_i^{n+1} = \begin{cases} (1+\gamma)^{-1} [v^n - T^*Tv^n + T^*g]_i, & |[v^n - T^*Tv^n + T^*g]_i| \le (1+\gamma)^{1/2} r, \\ [v^n - T^*Tv^n + T^*g]_i, & \text{else.} \end{cases}$$

See [21, Remark 2] for a more detailed account on how to compute the scalar function $H_p$ for a generic $p \ge 1$, and see Figure 1 for the cases $p = 1$, $p = 3/2$, and $p = 2$. Notice that $H_p$ is always discontinuous for any $p \ge 1$.
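
The closed forms for $p = 2$ and for $p = 1$ with $r > \gamma/4$ can be compared against a brute-force minimization of (3.14) on a grid. The following is a minimal sketch; the function names `H2`, `H1`, `brute_force_argmin` and the sample values are illustrative assumptions.

```python
import numpy as np

def H2(xi, gamma, r):
    """p = 2: shrink by 1/(1+gamma) up to the jump xi' = sqrt(1+gamma) * r, identity beyond."""
    xi = np.asarray(xi, dtype=float)
    jump = np.sqrt(1.0 + gamma) * r
    return np.where(np.abs(xi) <= jump, xi / (1.0 + gamma), xi)

def H1(xi, gamma, r):
    """p = 1 with r > gamma/4, as in (3.12): soft threshold up to r + gamma/4, identity beyond."""
    xi = np.asarray(xi, dtype=float)
    soft = np.sign(xi) * np.maximum(np.abs(xi) - gamma / 2.0, 0.0)
    return np.where(np.abs(xi) <= r + gamma / 4.0, soft, xi)

def brute_force_argmin(xi, gamma, r, p, grid):
    """Grid minimizer of (t - xi)^2 + gamma * min(|t|^p, r^p), cf. (3.14)."""
    values = (grid - xi) ** 2 + gamma * np.minimum(np.abs(grid) ** p, r ** p)
    return grid[np.argmin(values)]

grid = np.linspace(-3.0, 3.0, 6001)      # step 1e-3
samples = np.array([-2.0, -1.3, 0.3, 1.0, 1.3, 1.6, 2.5])
closed_form_2 = H2(samples, gamma=1.0, r=1.0)
closed_form_1 = H1(samples, gamma=1.0, r=1.0)
```

For $\gamma = r = 1$ the jump sits at $\xi' = \sqrt{2}$ for $p = 2$ and at $\xi' = 5/4$ for $p = 1$, which is why the sample $\xi = 1.3$ is shrunk in the first case and left untouched in the second.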

Let us now discuss the convergence properties of such an algorithm. For that we define the operator $M : \mathcal{H} \to \mathcal{H}$ by its component-wise action,

$$[M(v)]_i := H_p([v - T^*Tv + T^*g]_i); \tag{3.16}$$

the iteration (3.15) can then be written more concisely in operator notation as

$$v^{n+1} = M(v^n). \tag{3.17}$$

The following convergence result has been proved in [21, Theorem 4.8 and Theorem 5.1].
Figure 1: The discontinuous thresholding functions $H_1$, $H_{3/2}$, and $H_2$, with parameters $p = 1$, $3/2$, and $2$, respectively, and $r = 1$, $\gamma = 1$.

Theorem 3.3. Suppose $p \ge 1$. Starting from any $v^0 \in \mathcal{H}$, the sequence $(v^n)_{n \in \mathbb{N}}$ defined by $v^{n+1} = M(v^n)$ as in (3.17) will converge to a vector $\bar v \in \mathcal{H}$ that satisfies the fixed point condition,

$$\bar v = M(\bar v).$$

Let us further denote by $\operatorname{Fix}(M)$ the set of fixed points of $M$, by $\mathcal{G}$ the set of global minimizers of $J_p$, and by $\mathcal{L}$ the set of local minimizers of $J_p$. Then we have the following set inclusions

$$\mathcal{G} \subset \operatorname{Fix}(M) \subset \mathcal{L}. \tag{3.18}$$

Let us remark that the proof of convergence of the algorithm to fixed points in $\operatorname{Fix}(M)$ is strongly based on the discontinuity of the thresholding function, see [21, Lemma 4.4].
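
To see the statement in action for $p = 2$, one can run the iteration on a small example and record the energies. This is a sketch under assumed data: the diagonal $T$, the vector $g$, the starting point and the iteration count are hypothetical, and the helper names `J2` and `M` are chosen only to mirror the notation above.

```python
import numpy as np

def J2(v, T, g, gamma, r):
    """J_p for p = 2: ||T v - g||^2 + gamma * sum_i min(v_i^2, r^2)."""
    res = T @ v - g
    return float(res @ res + gamma * np.sum(np.minimum(v ** 2, r ** 2)))

def M(v, T, g, gamma, r):
    """One step (3.16) for p = 2: threshold the update v - T^T T v + T^T g."""
    xi = v - T.T @ (T @ v) + T.T @ g
    jump = np.sqrt(1.0 + gamma) * r
    return np.where(np.abs(xi) <= jump, xi / (1.0 + gamma), xi)

# Hypothetical data with ||T|| = 0.9 < 1.
T = np.diag([0.9, 0.8, 0.7])
g = np.array([3.0, 0.2, -2.0])
gamma, r = 1.0, 1.0

v = np.zeros(3)
energies = [J2(v, T, g, gamma, r)]
for _ in range(200):
    v = M(v, T, g, gamma, r)
    energies.append(J2(v, T, g, gamma, r))
```

In this example the third component starts on the shrinking branch and crosses the jump after one step; the recorded energies nevertheless never increase, and the final `v` is reproduced by one more application of `M`.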

3.4 Constrained minimization 

Due to the nonsmoothness and nonconvexity of $J_p$, the more general linearly constrained minimization (3.5) has been so far an open problem, as standard methods, such as SQP and Newton methods, do not apply, unless one provides a regularization of the problem. In particular, it would be desirable that an appropriate algorithm performing such an optimization could retain both the simplicity of the thresholding iteration (3.15) and its global convergence properties, as given by Theorem 3.3. Certainly the iteration (2.14) is a strong candidate, as the iterations of its inner loop again require only an unconstrained minimization, which can be again addressed by iterative thresholding, see Section 4 below. However, we encounter two major bottlenecks to the direct application of this algorithm to (3.5). The first problem is that $J_p$ does not satisfy our main assumption (A1), i.e., it is not $\omega$-convex, as it is not a $C^1$-perturbation of a convex functional. In fact the term $W_r^p$ is too rough at the kink where the truncation applies. The second trouble comes from the lack of coerciveness of $J_p$ on the affine space $\mathcal{F}(f)$ in general, for a generic choice of $T$. In the next sections we address these two issues.



3.5 A smoothing method 

In this section we would like to construct an appropriate slightly smoother perturbation $J_p^\varepsilon$ of $J_p$, which eventually allows for $\omega$-convexity, but does not essentially modify the minimizers over $\mathcal{F}(f)$. Moreover, such a modification will not affect the possibility of using discontinuous thresholding functions in the numerical setting. We will see in Sections 3.7 and 4 the usefulness of this requirement. For the moment, we will only define this smooth perturbation and state its main properties, whose proofs are deferred to the Appendix for easier reading.

We start by the following polynomial interpolation result. 

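Lemma 3.4. Let $0 < s_1 < s_2$ and assume that

$$\pi(t) := A(t - s_2)^3 + B(t - s_2)^2 + C$$

is a third degree polynomial. Given $\gamma_1, \gamma_2, \gamma_3 \in \mathbb{R}$ and by setting

$$\begin{cases} C = \gamma_3, \\ B = \dfrac{\gamma_1}{s_2 - s_1} - \dfrac{3(\gamma_3 - \gamma_2)}{(s_2 - s_1)^2}, \\ A = \dfrac{\gamma_1}{3(s_2 - s_1)^2} + \dfrac{2B}{3(s_2 - s_1)}, \end{cases} \tag{3.19}$$

then we have the following interpolation properties

$$\pi(s_2) = \gamma_3, \quad \pi(s_1) = \gamma_2, \quad \pi'(s_2) = 0, \quad \pi'(s_1) = \gamma_1. \tag{3.20}$$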



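Proof. The equalities related to $s_2$ are straightforward, the others related to $s_1$ follow by simple direct computations:

$$\begin{aligned} \pi(s_1) &= -\frac{\gamma_1}{3}(s_2 - s_1) - \frac{2}{3}B(s_2 - s_1)^2 + B(s_2 - s_1)^2 + \gamma_3 \\ &= -\frac{\gamma_1}{3}(s_2 - s_1) + \frac{B}{3}(s_2 - s_1)^2 + \gamma_3 \\ &= -\frac{\gamma_1}{3}(s_2 - s_1) + \frac{\gamma_1}{3}(s_2 - s_1) - (\gamma_3 - \gamma_2) + \gamma_3 = \gamma_2, \end{aligned}$$

and

$$\begin{aligned} \pi'(s_1) &= 3A(s_1 - s_2)^2 + 2B(s_1 - s_2) \\ &= \gamma_1 - 2B(s_1 - s_2) + 2B(s_1 - s_2) = \gamma_1. \end{aligned}$$

□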
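
The coefficient formulas (3.19) are easy to check numerically. The short sketch below uses hypothetical helper names and sample data (the values correspond to blending $t^2$ into the constant $r^2$ with $r = 1$ and half-width $0.1$) and evaluates the four interpolation conditions (3.20).

```python
def cubic_coefficients(s1, s2, g1, g2, g3):
    """Coefficients (A, B, C) of (3.19) for pi(t) = A (t - s2)^3 + B (t - s2)^2 + C."""
    d = s2 - s1
    C = g3
    B = g1 / d - 3.0 * (g3 - g2) / d ** 2
    A = g1 / (3.0 * d ** 2) + 2.0 * B / (3.0 * d)
    return A, B, C

def cubic(t, s2, A, B, C):
    """Value of the interpolating cubic at t."""
    return A * (t - s2) ** 3 + B * (t - s2) ** 2 + C

def cubic_prime(t, s2, A, B):
    """Derivative of the interpolating cubic at t."""
    return 3.0 * A * (t - s2) ** 2 + 2.0 * B * (t - s2)

# Sample data (assumed): gamma_1 = 2 * 0.9, gamma_2 = 0.9^2, gamma_3 = 1.
s1, s2 = 0.9, 1.1
g1, g2, g3 = 2 * 0.9, 0.9 ** 2, 1.0
A, B, C = cubic_coefficients(s1, s2, g1, g2, g3)
```

For these data the formulas give $A = -2.5$ and $B = -5.25$.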
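Given $0 < \varepsilon < r$, for every $t \in [r - \varepsilon, r + \varepsilon]$ we define $\pi_p(t) = \pi(t)$ as in Lemma 3.4 for $s_1 = r - \varepsilon$, $s_2 = r + \varepsilon$, $\gamma_1 = p(r - \varepsilon)^{p-1}$, $\gamma_2 = (r - \varepsilon)^p$, and $\gamma_3 = r^p$. For example, for $p = 2$, we have

$$\pi_2(t) = \frac{[t + (r - \varepsilon)][\varepsilon(r + t) - (r - t)^2]}{4\varepsilon}, \quad t \in \mathbb{R}.$$

Let us now set, for all $t \ge 0$,

$$W_r^{p,\varepsilon}(t) = \begin{cases} t^p, & 0 \le t \le r - \varepsilon, \\ \pi_p(t), & r - \varepsilon \le t \le r + \varepsilon, \\ r^p, & t \ge r + \varepsilon, \end{cases} \tag{3.21}$$

whereas for $t \le 0$, we define $W_r^{p,\varepsilon}(t) = W_r^{p,\varepsilon}(-t)$. Notice that now $W_r^{p,\varepsilon}$ is actually a $C^1$-function on $\mathbb{R}$, for all $0 < \varepsilon < r$.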
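
A direct implementation of the smoothed truncation makes the construction concrete. In this sketch the names `smoothed_W` and `pi_2` are hypothetical; the cubic blend uses the coefficients of Lemma 3.4, and `pi_2` is the closed form for $p = 2$ quoted above, so the two can be compared on the blending interval.

```python
def smoothed_W(t, r, p, eps):
    """W_r^{p,eps}(t): |t|^p below r - eps, cubic blend on [r - eps, r + eps], r^p above."""
    t = abs(t)                                   # even extension to negative t
    s1, s2 = r - eps, r + eps
    if t <= s1:
        return t ** p
    if t >= s2:
        return r ** p
    d = s2 - s1
    g1, g2, g3 = p * s1 ** (p - 1), s1 ** p, r ** p
    B = g1 / d - 3.0 * (g3 - g2) / d ** 2        # coefficients from (3.19)
    A = g1 / (3.0 * d ** 2) + 2.0 * B / (3.0 * d)
    return A * (t - s2) ** 3 + B * (t - s2) ** 2 + g3

def pi_2(t, r, eps):
    """Closed form of the blend for p = 2."""
    return (t + (r - eps)) * (eps * (r + t) - (r - t) ** 2) / (4.0 * eps)

r, eps = 1.0, 0.1
blend_values = [smoothed_W(t, r, 2, eps) for t in (0.92, 1.0, 1.08)]
closed_values = [pi_2(t, r, eps) for t in (0.92, 1.0, 1.08)]
```

A central finite difference across $t = r - \varepsilon$ and $t = r + \varepsilon$ gives slopes close to $p(r-\varepsilon)^{p-1}$ and $0$ respectively, in line with the $C^1$ matching.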

As announced before, this modification will not affect the possibility of using discontinuous thresholding techniques in the numerical setting, as we will see in Sections 3.7 and 4. To state this fact in a convenient way, we need to fix some notation. For $\xi \in \mathbb{R}$, $\gamma > 0$, and $r > 0$, we consider the function

$$g_{\xi,r}(t) = (t - \xi)^2 + \gamma W_r^p(t), \tag{3.22}$$

where again $W_r^p(t) = \min\{|t|^p, r^p\}$.

Remark 3.5. It follows from the proof of Proposition 3.2 in [21], see also (3.14), that, for fixed $\gamma, r, \xi$ the function $g_{\xi,r}$ has a unique global minimizer $\bar t$ in $\mathbb{R}$, which is given by

$$\bar t = H_p(\xi),$$

where $H_p$ is the thresholding function defined in Proposition 3.2. In particular, due to the discontinuity of $H_p$, there exists $\varepsilon_0 > 0$ independent of $\xi$ such that for all $0 < \varepsilon < \varepsilon_0$ such a minimizer does not belong to the set $[-r - \varepsilon, -r + \varepsilon] \cup [r - \varepsilon, r + \varepsilon]$.

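Let us further define the function

$$f_{\xi,r}(t) = (t - \xi)^2 + \gamma W_r^{p,\varepsilon}(t). \tag{3.23}$$

Then the following result holds.

Theorem 3.6. Let $\varepsilon_0 > 0$ as in Remark 3.5 and

$$\varepsilon < \min\Big\{ \varepsilon_0, \ \frac{(5/4)^{1/(p-1)} - 1}{(5/4)^{1/(p-1)}} \, r, \ \frac{\gamma p r^{p-1}}{20} \Big\}. \tag{3.24}$$

Then $f_{\xi,r}$ has a unique global minimizer in $\mathbb{R}$, which coincides with the one of $g_{\xi,r}$; that is, the thresholding function computes the minimizer

$$H_p(\xi) = \arg\min_{t \in \mathbb{R}} \, (t - \xi)^2 + \gamma W_r^{p,\varepsilon}(t),$$

independently of $\varepsilon$ as in (3.24).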
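
The claim can be probed numerically for $p = 2$: a grid minimization of the smoothed scalar problem should return the same point as the closed-form thresholding. The sketch below is only illustrative; the parameter values $\gamma = r = 1$, $\varepsilon = 0.05$, the grid resolution, and the helper names are assumptions, and a grid search is of course not a proof.

```python
def smoothed_W(t, r, p, eps):
    """W_r^{p,eps}(t) from (3.21), with the cubic blend of Lemma 3.4."""
    t = abs(t)
    s1, s2 = r - eps, r + eps
    if t <= s1:
        return t ** p
    if t >= s2:
        return r ** p
    d = s2 - s1
    g1, g2, g3 = p * s1 ** (p - 1), s1 ** p, r ** p
    B = g1 / d - 3.0 * (g3 - g2) / d ** 2
    A = g1 / (3.0 * d ** 2) + 2.0 * B / (3.0 * d)
    return A * (t - s2) ** 3 + B * (t - s2) ** 2 + g3

def H2(xi, gamma, r):
    """Closed-form thresholding for p = 2 (scalar version)."""
    jump = (1.0 + gamma) ** 0.5 * r
    return xi / (1.0 + gamma) if abs(xi) <= jump else xi

def argmin_on_grid(xi, gamma, r, eps, step=1e-3, half_width=3.0):
    """Grid minimizer of (t - xi)^2 + gamma * W_r^{2,eps}(t)."""
    best_t, best_val = None, float("inf")
    for k in range(int(round(2 * half_width / step)) + 1):
        t = -half_width + k * step
        val = (t - xi) ** 2 + gamma * smoothed_W(t, r, 2, eps)
        if val < best_val:
            best_t, best_val = t, val
    return best_t

gamma, r, eps = 1.0, 1.0, 0.05
samples = [0.3, 1.0, 1.3, 1.45, 1.6, 2.5, -1.3, -2.0]
pairs = [(argmin_on_grid(xi, gamma, r, eps), H2(xi, gamma, r)) for xi in samples]
```

For every sample the grid minimizer lies outside the blending band $[r - \varepsilon, r + \varepsilon]$, which is the mechanism that makes the smoothing invisible to the thresholding step.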
We leave the proof of Theorem 3.6 to the Appendix.

Thanks to the function $W_r^{p,\varepsilon}$, we can define the following perturbation of $J_p$:

$$J_p^\varepsilon(v) = \|Tv - g\|^2 + \gamma \sum_{i=1}^m W_r^{p,\varepsilon}(v_i). \tag{3.25}$$

About the existence of constrained minimizers of $J_p^\varepsilon$ we have the following abstract result, whose proof is again deferred to the Appendix.

Theorem 3.7. For $0 < \varepsilon < \varepsilon_0$, the problem

$$\text{Minimize } J_p^\varepsilon(v) = \|Tv - g\|^2 + \gamma \sum_{i=1}^m W_r^{p,\varepsilon}(v_i), \quad \text{subject to } Av = f, \tag{3.26}$$

has solutions in $\mathcal{H}$. Actually, such minimal solutions can be taken in a compact set $\mathcal{M} \subset \mathcal{H}$ independent of $0 < \varepsilon < \varepsilon_0$.

Remark 3.8. The previous result clarifies that, despite the fact that in general the $J_p^\varepsilon$ are not coercive functionals, up to restricting them to an appropriate compact set, independent of $\varepsilon$, they can be considered equi-coercive.

Corollary 3.9. The net of functionals $(J_p^\varepsilon)_{0 < \varepsilon < \varepsilon_0}$ $\Gamma$-converges to $J_p$ on $\mathcal{F}(f)$. Moreover, if we consider the minimizers $v_\varepsilon^*$ of $J_p^\varepsilon$ in $\mathcal{M}$, as constructed in Theorem 3.7 (which are actually minimizers of $J_p^\varepsilon$ over $\mathcal{F}(f)$ as well), then the accumulation points of such a net are minimizers of $J_p$.

Proof. As $J_p^\varepsilon$ converges uniformly to $J_p$ on $\mathcal{F}(f)$, we deduce immediately its $\Gamma$-convergence [12]. By Theorem 3.7 and compactness of $\mathcal{M}$ we conclude the convergence of minimizers. □



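Proposition 3.10. For all $0 < \varepsilon < \varepsilon_0$, the functional $J_p^\varepsilon$ satisfies the properties (A1) and (A2), i.e., it is $\omega$-convex, and (2.9) holds.

Proof. The $\omega$-convexity follows from the piecewise continuity and boundedness of the second derivatives of $J_p^\varepsilon$. Since $W_r^{p,\varepsilon}(t) \ge 0$ and $|(W_r^{p,\varepsilon})'(t)| \le p r^{p-1}$ for every $t \in \mathbb{R}$, by means of the elementary inequality $a \le \frac12(a^2 + 1)$ we obtain

$$\begin{aligned} \|\nabla J_p^\varepsilon(v)\| &\le 2\|T^*(Tv - g)\| + \gamma \|((W_r^{p,\varepsilon})'(v_1), \dots, (W_r^{p,\varepsilon})'(v_m))\| \\ &\le 2\|T^*\| \|Tv - g\| + \gamma m^{1/2} p r^{p-1} \\ &\le \|T^*\| \|Tv - g\|^2 + \|T^*\| + \gamma m^{1/2} p r^{p-1} \\ &\le \|T^*\| J_p^\varepsilon(v) + \|T^*\| + \gamma m^{1/2} p r^{p-1}. \end{aligned} \tag{3.27}$$

Hence, for $K = \|T^*\|$ and $L = \|T^*\| + \gamma m^{1/2} p r^{p-1}$, we get that (2.9) holds for $J_p^\varepsilon$. □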
3.6 The application of the algorithm to the Mumford-Shah case

As we clarified in the previous section, functionals of the type $J_p^\varepsilon$, for $0 < \varepsilon < \varepsilon_0$, satisfy the assumptions (A1) and (A2) for the applicability of the algorithm (2.14). In particular, when the algorithm is applied for $J = J_p^\varepsilon$, then by Theorem 2.9 the sequence $(v_\ell)_{\ell \in \mathbb{N}}$ generated by the algorithm has the properties

(a) $(Av_\ell - f) \to 0$ as $\ell \to \infty$;

(b) $(v_\ell - v_{\ell-1}) \to 0$ as $\ell \to \infty$.

However, $J_p^\varepsilon$ is unfortunately not necessarily coercive on $\mathcal{F}(f) = \{v \in \mathcal{H} : Av = f\}$, i.e., it does not satisfy (A3) in general, although it retains some coerciveness by considering suitable compact subsets $\mathcal{M}$ of competitors. Nevertheless, such information does not help when it comes to the application of the algorithm (2.14), as there is no natural or simple way of restricting or projecting the iterations to such compact sets $\mathcal{M}$. Hence, in order to apply Theorem 2.10, we need to explore whether, despite the lack of coerciveness, the iterations $(v_\ell)_{\ell \in \mathbb{N}}$ generated by the algorithm remain bounded.

Let us first introduce some specific notation for the application of the algorithm (2.14), 
in particular we denote 

Lemma 3.11. For all $0 < \varepsilon < \varepsilon_0$, the sequence $(\|\nabla J_p^\varepsilon(v_\ell)\|)_{\ell \in \mathbb{N}}$ is uniformly bounded, where the iterations $(v_\ell)_{\ell \in \mathbb{N}}$ are generated by the algorithm (2.14).

Proof. As a consequence of (2.26) the sequence (||T^||)^, where V£ is defined in (2.21), 
is uniformly bounded. From (2.24), we have also that (||T^||)^ is uniformly bounded. As 
pointed out in (3.27) of Proposition 3.10 actually we have $\|\nabla J_p^\varepsilon(v_\ell)\| \le 2\|T^*\| \|Tv_\ell - g\| + \gamma m^{1/2} p r^{p-1}$. Hence the sequence $(\|\nabla J_p^\varepsilon(v_\ell)\|)_{\ell \in \mathbb{N}}$ is uniformly bounded. □

The next lemma is stated in a slightly more general form than the one actually needed in this subsection, in order to allow its application also to the case later discussed in Proposition 3.19. In the statement, given any subset of indexes $I_1 \subset I$, we denote with $P_1$ the orthogonal projection onto the subspace $\mathcal{H}^1 := \{v \in \mathcal{H} : v_i = 0, \ i \in I \setminus I_1\}$.

Lemma 3.12. Let us consider any $I_1 \subset I$ (hence it may also be $I_1 = I$) and $A_1 = AP_1$. Then, for all $0 < \varepsilon < \varepsilon_0$, the sequence $(A_1^* q_{\ell,L_\ell - 1})_{\ell \in \mathbb{N}}$ generated by the application of the algorithm (2.14) for $J = J_p^\varepsilon$ is uniformly bounded.

Proof. As $A_1 = AP_1$, then $A_1^* = P_1 A^*$, hence it is sufficient to prove that $(A^* q_{\ell,L_\ell - 1})_{\ell \in \mathbb{N}}$ generated by the application of the algorithm (2.14) is uniformly bounded. By (2.16) we have

$$A^* q_\ell = \nabla J_{\omega,v_{\ell-1}}(v_\ell) = \nabla J_p^\varepsilon(v_\ell) + 2\omega(v_\ell - v_{\ell-1}).$$

As, by Lemma 3.11, $\nabla J_p^\varepsilon(v_\ell)$ is uniformly bounded and $v_\ell - v_{\ell-1} \to 0$ for $\ell \to \infty$, we obtain that also $A^* q_\ell$ is uniformly bounded. By (2.14), we have also

$$A^* q_\ell = A^* q_{\ell,L_\ell - 1} - 2\lambda A^*(Av_\ell - f),$$

from which, together with $(Av_\ell - f) \to 0$ for $\ell \to \infty$, we eventually deduce the uniform boundedness of $A^* q_{\ell,L_\ell - 1}$ as well. □
Lemma 3.13. Assume that $T$ is injective on $\ker A$, or equivalently $\ker T \cap \ker A = \{0\}$. Then, for all $0 < \varepsilon < \varepsilon_0$, the sequence $(v_\ell)_\ell$ generated by the application of the algorithm (2.14) for $J = J_p^\varepsilon$ is uniformly bounded.

Proof. Notice that, by definition of $v_\ell$ in (2.14), necessarily it solves the following linear
system

(T* T + l -A*A)v, = ±A*(f + q lM _ x ) + u(v t -i - v e ), 

where the right-hand side of this equality is uniformly bounded by Lemma 3.12 and Theorem 2.9 (b). Moreover, as $(Av_\ell - f) \to 0$ for $\ell \to \infty$, we can write that $v_\ell$ is a solution of
the system

(T*T+\A*A) 
A 



vg = ivg, 



:=G 

where the right-hand side $w_\ell$ is actually uniformly bounded with respect to $\ell$. Due to our assumption $\ker T \cap \ker A = \{0\}$, we obtain that $\ker G = \{0\}$ and

$$v_\ell = (G^*G)^{-1} G^* w_\ell, \quad \text{for all } \ell \in \mathbb{N},$$

hence the uniform boundedness of $(v_\ell)_\ell$. □

We summarize this list of technical observations into the following convergence result. 

Theorem 3.14. Assume that $T$ is injective on $\ker A$, or equivalently $\ker T \cap \ker A = \{0\}$. Then, for all $0 < \varepsilon < \varepsilon_0$, the sequence $(v_\ell)_\ell$ generated by the application of the algorithm (2.14) for $J = J_p^\varepsilon$ has at least one accumulation point, and every accumulation point is a constrained critical point of $J_p^\varepsilon$ on the affine space $\mathcal{F}(f) = \{v \in \mathcal{H} : Av = f\}$.

Proof. The result follows by a direct application of Theorem 2.10, after having recalled the boundedness of $(v_\ell)_\ell$, which results from Lemma 3.13. □

Remark 3.15. The previous convergence result actually applies for the case of the Mumford-Shah functional, for which $T = D_h^\dagger$ and $A = I - D_h D_h^\dagger$, since $D_h^\dagger$ is in fact injective on $\operatorname{ran}(D_h)$, see Section 3.1.



3.7 The application of the algorithm to inverse free-discontinuity problems

When the operator $T$ is actually the composition of $D_h^\dagger$ with another noninvertible operator, say $S$, i.e., $T = S \circ D_h^\dagger$, representing a model of an inverse free-discontinuity problem [19, 21], then the condition $\ker T \cap \ker A = \{0\}$ might not be verified anymore, and we wonder in this case under which natural conditions the algorithm can still converge. For this, we will need to make a finer analysis of the behavior of the algorithm (2.14) when applied to $J = J_p^\varepsilon$.

For the sake of simplicity and without loss of generality, we consider the application of (2.14) for $\lambda = 1/2$, and we define now the strictly convex functional

$$J_{\omega,u}(v,q) := J^\varepsilon_{\omega,u}(v,q) = J^\varepsilon_{\omega,u}(v) + \tfrac12 \|Av - (f + q)\|^2. \tag{3.29}$$
We further consider the surrogate functional associated to $J_{\omega,u}(v,q)$, given by

$$\begin{aligned} J^{\mathrm{surr}}_{\omega,u}(v,q,w) := J^{\varepsilon,\mathrm{surr}}_{\omega,u}(v,q,w) = J_{\omega,u}(v,q) &+ (\|v - w\|^2 - \|Tv - Tw\|^2) \\ &+ (\|v - w\|^2 - \tfrac12 \|Av - Aw\|^2) \\ &+ (\|v - w\|^2 - \omega \|v - w\|^2). \end{aligned} \tag{3.30}$$

Up to rescaling of $g, f, q, \gamma$ we can assume here and later, without loss of generality, that $\|T\| < 1$, $\frac{1}{\sqrt{2}}\|A\| < 1$, and $\omega < 1$. Hence, we have

$$J^{\mathrm{surr}}_{\omega,u}(v,q,w) \ge J_{\omega,u}(v,q), \tag{3.31}$$

and

$$J^{\mathrm{surr}}_{\omega,u}(v,q,w) = J_{\omega,u}(v,q), \tag{3.32}$$

if and only if $w = v$.

Proposition 3.16. Assume $\|T\| < 1$, $\frac{1}{\sqrt{2}}\|A\| < 1$, and $\omega < 1$. Moreover, let $\varepsilon_0 > 0$ as in Remark 3.5 and

$$\varepsilon < \min\Big\{ \varepsilon_0, \ \frac{(5/4)^{1/(p-1)} - 1}{(5/4)^{1/(p-1)}} \, r, \ \frac13 \frac{\gamma p r^{p-1}}{20} \Big\}. \tag{3.33}$$

Then

$$v^* = \arg\min_{v \in \mathcal{H}} J_{\omega,u}(v,q) \tag{3.34}$$

if and only if $v^*$ satisfies the following component-wise fixed-point equation: for $i = 1, \dots, m$,

$$v_i^* = H_p\Big( \tfrac13 \big\{ [(I - T^*T) + (I - \tfrac12 A^*A) + (1 - \omega)I] v^* + (T^*g + \tfrac12 A^*(f + q) + \omega u) \big\}_i \Big), \tag{3.35}$$

where $H_p$ is the thresholding function defined in Proposition 3.2 for the parameters $r$ and $\gamma/3$ (notice that an additional factor $\frac13$ in fact appears in the last term of (3.33) with respect to the corresponding one in (3.24)).

Proof. Assume that $v^*$ satisfies (3.34). From (3.31) and (3.32), we have the inequalities

$$J^{\mathrm{surr}}_{\omega,u}(v^*,q,v^*) = J_{\omega,u}(v^*,q) \le J_{\omega,u}(v,q) \le J^{\mathrm{surr}}_{\omega,u}(v,q,v^*).$$

Hence we obtain also

$$v^* = \arg\min_{v \in \mathcal{H}} J^{\mathrm{surr}}_{\omega,u}(v,q,v^*). \tag{3.36}$$

We notice now by a direct computation that

$$\frac13 J^{\mathrm{surr}}_{\omega,u}(v,q,v^*) = \Big\| v - \frac{b^1 + b^2 + b^3}{3} \Big\|^2 + \frac{\gamma}{3} \sum_{i=1}^m W_r^{p,\varepsilon}(v_i) + C(b^1, b^2, b^3, \gamma), \tag{3.37}$$

where $b^1 = (I - T^*T)v^* + T^*g$, $b^2 = (I - \frac12 A^*A)v^* + \frac12 A^*(f + q)$, $b^3 = (I - \omega I)v^* + \omega u$, and $C(b^1, b^2, b^3, \gamma)$ is a term that does not depend on $v$. One now concludes by an application of Theorem 3.6 and Proposition 3.2 that $v^*$ satisfies (3.35).

Conversely, by (3.37), Proposition 3.2, and Theorem 3.6, if $v^*$ satisfies (3.35), then it also satisfies (3.36). It follows that

$$0 \in \partial J^{\mathrm{surr}}_{\omega,u}(v^*,q,v^*) = \partial J_{\omega,u}(v^*,q),$$

where the last equality trivially follows from (3.30). By convexity of $J_{\omega,u}$, this implies (3.34). □

In the rest of the paper, the following notations will be useful. For $r > 0$ and $v \in \mathcal{H}$ fixed, we denote $I_0 := \{i \in I = \{1, \dots, m\} : |v_i| \le r + \varepsilon\}$ and $I_1 = I \setminus I_0$. We also define $\mathcal{H}^0 := \{v \in \mathcal{H} : v_i = 0, \ i \in I_1\}$ and $\mathcal{H}^1 := \{v \in \mathcal{H} : v_i = 0, \ i \in I_0\}$. Let $P_0$ and $P_1$ denote the orthogonal projections onto the subspaces $\mathcal{H}^0$ and $\mathcal{H}^1$, respectively. We fix the notation for the operators $T_i = TP_i$ and $A_i = AP_i$ for $i = 0, 1$.

Remark 3.17. Notice that the fixed-point condition (3.35) in particular implies

$$\begin{aligned} I_0 &= \{i \in I : |v_i| \le r + \varepsilon\} = \{i \in I : |v_i| \le \xi'(r, \gamma/3, p) - \delta\}, \\ I_1 &= \{i \in I : |v_i| > r + \varepsilon\} = \{i \in I : |v_i| > \xi'(r, \gamma/3, p)\}. \end{aligned}$$

Here, for $\varepsilon > 0$ sufficiently small,

$$\xi'(r, \gamma/3, p) - \delta < r - \varepsilon < r < r + \varepsilon < \xi'(r, \gamma/3, p),$$

where $\xi' = \xi'(r, \gamma/3, p)$ is the position of the jump-discontinuity of the thresholding function and $\delta$ is the size of its jump-discontinuity, as defined in Proposition 3.2. We call this phenomenon the separation of the components.
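
The separation of the components is easy to visualize for $p = 2$, where the jump position and size are explicit. The sketch below uses hypothetical parameter values and helper names; it applies the thresholding function with parameter $\gamma/3$ to random inputs and checks that no output lands in the gap $(\xi' - \delta, \xi']$.

```python
import numpy as np

def jump_position_and_size(gamma, r):
    """For p = 2: jump at xi' = sqrt(1 + gamma) * r, of size delta = xi' - xi' / (1 + gamma)."""
    xi_jump = np.sqrt(1.0 + gamma) * r
    delta = xi_jump - xi_jump / (1.0 + gamma)
    return xi_jump, delta

def H2(xi, gamma, r):
    """Thresholding function for p = 2."""
    xi = np.asarray(xi, dtype=float)
    jump = np.sqrt(1.0 + gamma) * r
    return np.where(np.abs(xi) <= jump, xi / (1.0 + gamma), xi)

gamma, r = 3.0, 1.0                      # hypothetical values; thresholding uses gamma / 3
xi_jump, delta = jump_position_and_size(gamma / 3.0, r)

rng = np.random.default_rng(2)
v = H2(rng.uniform(-4.0, 4.0, size=1000), gamma / 3.0, r)

small = np.abs(v) <= xi_jump - delta + 1e-12   # index set I_0 (tolerance for rounding)
large = np.abs(v) > xi_jump                    # index set I_1
```

Every component is classified as either small or large, and the truncation level $r$ sits strictly inside the empty gap between the two groups.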

Let us now resume the algorithm (2.14) and see how the separation of the components affects its iteration.

Lemma 3.18 (Fixation of the index set $I_1$). Assume $\|T\| < 1$, $\frac{1}{\sqrt{2}}\|A\| < 1$, and $\omega < 1$. Moreover, let $\varepsilon_0 > 0$ as in Remark 3.5 and

$$\varepsilon < \min\Big\{ \varepsilon_0, \ \frac{(5/4)^{1/(p-1)} - 1}{(5/4)^{1/(p-1)}} \, r, \ \frac13 \frac{\gamma p r^{p-1}}{20} \Big\}. \tag{3.38}$$

Then, for the sequence $(v_\ell)_{\ell \in \mathbb{N}}$ generated by the algorithm (2.14), we consider the time-dependent partition of the index set $I = \{1, \dots, m\}$ into "small" components

$$I_0^\ell := \{i \in I : |(v_\ell)_i| \le \xi' - \delta\}, \tag{3.39}$$

and "large" components

$$I_1^\ell := \{i \in I : |(v_\ell)_i| > \xi'\}. \tag{3.40}$$

Then for $N \in \mathbb{N}$ sufficiently large, this partition fixes during the iteration for $\ell \ge N$; that is, there exists $I_0$ such that for all $\ell \ge N$, $I_0^\ell = I_0$ and $I_1^\ell = I \setminus I_0$.
Proof. Notice that, by definition in (2.14) and (3.29), we have actually

$$v_\ell = \arg\min_{v \in \mathcal{H}} J_{\omega,v_{\ell-1}}(v, q_{\ell,L_\ell - 1}).$$

Hence, by Proposition 3.16 and Remark 3.17 we have

(a) $|(v_\ell)_i| \le \xi' - \delta < \xi'$, if $i \in I_0^\ell$, or

(b) $|(v_\ell)_i| > \xi'$, if $i \in I_1^\ell$.

Thus, $|(v_{\ell+1})_i - (v_\ell)_i| > \delta$ if $i \in I_0^{\ell+1} \cap I_1^\ell$, or if $i \in I_0^\ell \cap I_1^{\ell+1}$. At the same time, Theorem 2.9 (b), together with the results of the previous sections, imply

$$|(v_{\ell+1})_i - (v_\ell)_i| \le \|v_{\ell+1} - v_\ell\| < \varepsilon, \tag{3.41}$$

once $\ell \ge N(\varepsilon)$, and $\varepsilon > 0$ can be taken arbitrarily small. In particular, (3.41) implies that $I_0$ and $I_1$ must be fixed once $\ell \ge N(\varepsilon)$ and $\varepsilon < \delta$. □

If we assumed that $\ker T \cap \ker A \ne \{0\}$, then we could no longer use Lemma 3.13 to infer the uniform boundedness of $(v_\ell)_\ell$, which is the key ingredient for again obtaining a convergence result as in Theorem 3.14. However, we notice that, due to the separation of the components and after fixation of the associated index sets as in Lemma 3.18, actually the component $P_0(v_\ell)$, where $P_0$ is the orthogonal projection onto vectors supported on $I_0$, is uniformly bounded, by definition. If we could obtain an affine dependence of the component $P_1(v_\ell)$ supported on $I_1$ on the component $P_0(v_\ell)$, then we could infer the boundedness of $P_1(v_\ell)$ as well and hence of $v_\ell = P_0(v_\ell) + P_1(v_\ell)$. This is the scope of the following result.

Proposition 3.19. Assume $\|T\| < 1$, $\frac{1}{\sqrt{2}}\|A\| < 1$, and $\omega < 1$. Moreover, let $\varepsilon_0 > 0$ as in Remark 3.5 and

$$\varepsilon < \min\Big\{ \varepsilon_0, \ \frac{(5/4)^{1/(p-1)} - 1}{(5/4)^{1/(p-1)}} \, r, \ \frac13 \frac{\gamma p r^{p-1}}{20} \Big\}.$$

Assume that $I_0$ and $I_1$ are the index sets produced by the algorithm (2.14) after fixation as in Lemma 3.18. Let us consider the restricted operators $T_j = TP_j$ and $A_j = AP_j$, associated to the partition $I_j$, for $j = 0, 1$, respectively. If

$$\ker(T_1^*T_1 + \tfrac12 A_1^*A_1) = \{0\}, \tag{3.42}$$

then the sequence $(v_\ell)_{\ell \in \mathbb{N}}$ generated by the algorithm is uniformly bounded.

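Proof. By minimality of $v_\ell = \arg\min_{v \in \mathcal{H}} J_{\omega,v_{\ell-1}}(v, q_{\ell,L_\ell - 1})$, and if we fix $v_\ell^0 := P_0(v_\ell)$, the vector $v_\ell^1 := P_1(v_\ell)$ satisfies

$$v_\ell^1 = \arg\min_{z \in \mathcal{H}^1} J_{\omega,v_{\ell-1}}(v_\ell^0 + z, q_{\ell,L_\ell - 1}),$$

where, up to terms that do not depend on $z$,

$$J_{\omega,v_{\ell-1}}(v_\ell^0 + z, q_{\ell,L_\ell - 1}) = \|T_1 z - (g - T_0 v_\ell^0)\|^2 + \tfrac12 \|A_1 z - (f + q_{\ell,L_\ell - 1} - A_0 v_\ell^0)\|^2 + \omega \|z - P_1 v_{\ell-1}\|^2 + \gamma \sum_{i \in I_1} W_r^{p,\varepsilon}(z_i).$$

As $|(v_\ell^1)_i| > \xi' > r + \varepsilon$, $i \in I_1$, then $v_\ell^1$ is also the unique minimizer of

$$\|T_1 z - (g - T_0 v_\ell^0)\|^2 + \tfrac12 \|A_1 z - (f + q_{\ell,L_\ell - 1} - A_0 v_\ell^0)\|^2 + \omega \|z - P_1 v_{\ell-1}\|^2, \tag{3.43}$$

or else the vector $z^*$ minimizing (3.43) would satisfy $J_{\omega,v_{\ell-1}}(z^*) < J_{\omega,v_{\ell-1}}(v_\ell)$, in contradiction to the minimality of $v_\ell^1$. The uniqueness of such minimization comes from the strict convexity of (3.43). Hence, $v_\ell^1$ is also characterized as the solution of the Euler-Lagrange equations

$$2T_1^*(T_1 z - (g - T_0 v_\ell^0)) + A_1^*(A_1 z - (f + q_{\ell,L_\ell - 1} - A_0 v_\ell^0)) + 2\omega(z - P_1 v_{\ell-1}) = 0,$$

or

$$[T_1^*T_1 + \tfrac12 A_1^*A_1] v_\ell^1 = T_1^*(g - T_0 v_\ell^0) + \tfrac12 A_1^*(f + q_{\ell,L_\ell - 1} - A_0 v_\ell^0) + \omega P_1(v_{\ell-1} - v_\ell) =: w_\ell.$$

Since $\ker(T_1^*T_1 + \frac12 A_1^*A_1) = \{0\}$, the operator $T_1^*T_1 + \frac12 A_1^*A_1$ is actually invertible and hence

$$v_\ell^1 = [T_1^*T_1 + \tfrac12 A_1^*A_1]^{-1} w_\ell.$$

We observe now that $w_\ell$ is uniformly bounded, because by construction $v_\ell^0$ is bounded, by Lemma 3.12 $A_1^* q_{\ell,L_\ell - 1}$ is bounded, and by Theorem 2.9 (b) $(v_\ell - v_{\ell-1})$ is also bounded. We conclude that $v_\ell^1$ and therefore $v_\ell = v_\ell^0 + v_\ell^1$ are also uniformly bounded. □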

Having been able to recover the boundedness of the sequence $(v_\ell)_{\ell \in \mathbb{N}}$ thanks to condition (3.42), we are now able to conclude this section with a corresponding convergence result.

Theorem 3.20. Assume $\|T\| < 1$, $\frac{1}{\sqrt{2}}\|A\| < 1$, and $\omega < 1$. Moreover, let $\varepsilon_0 > 0$ as in Remark 3.5 and

$$\varepsilon < \min\Big\{ \varepsilon_0, \ \frac{(5/4)^{1/(p-1)} - 1}{(5/4)^{1/(p-1)}} \, r, \ \frac13 \frac{\gamma p r^{p-1}}{20} \Big\}.$$

Assume that $I_0$ and $I_1$ are the index sets produced by the algorithm (2.14) after fixation as in Lemma 3.18. Let us consider the restricted operators $T_j = TP_j$ and $A_j = AP_j$, associated to the partition $I_j$, for $j = 0, 1$, respectively. If

$$\ker(T_1^*T_1 + \tfrac12 A_1^*A_1) = \{0\},$$

then the sequence $(v_\ell)_\ell$ generated by the algorithm (2.14) for $J = J_p^\varepsilon$ has at least one accumulation point, and every accumulation point is a constrained critical point of $J_p^\varepsilon$ on the affine space $\mathcal{F}(f) = \{v \in \mathcal{H} : Av = f\}$.

Proof. The result follows by a direct application of Theorem 2.10, after having recalled the boundedness of $(v_\ell)_\ell$, which results from Proposition 3.19. □
Remark 3.21. Although the condition $\ker(T_1^*T_1 + \frac12 A_1^*A_1) = \{0\}$ depends on the actual realization of the algorithm, in principle the null space of minors of the matrix $T^*T + \frac12 A^*A$ can be checked a priori, identifying the possible admissible configurations of $I_1$ for which the algorithm is guaranteed to converge. It is also known that for certain random matrices, restricted injectivity properties are actually ensured as soon as $I_1$ is not too large, see for instance [35].
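
Such an a priori check amounts to a rank test on a principal minor. The sketch below reads the condition as injectivity on the coordinates in $I_1$, which is an interpretation made for this example; the function name `is_admissible` and the small matrices are hypothetical.

```python
import numpy as np

def is_admissible(T, A, large_idx):
    """True if the minor of T^T T + (1/2) A^T A indexed by large_idx is nonsingular."""
    M = T.T @ T + 0.5 * (A.T @ A)
    minor = M[np.ix_(large_idx, large_idx)]
    return np.linalg.matrix_rank(minor) == len(large_idx)

# Hypothetical example in which ker T and ker A share the direction (1, -1, 0).
T = np.array([[1.0, 1.0, 0.0]])
A = np.array([[0.0, 0.0, 1.0]])

ok = is_admissible(T, A, [0, 2])     # large components on coordinates 0 and 2
bad = is_admissible(T, A, [0, 1])    # large components on coordinates 0 and 1
```

Here the configuration `[0, 2]` is admissible while `[0, 1]` is not, even though $\ker T \cap \ker A \ne \{0\}$ in both cases.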



4 Iterative Thresholding Algorithms Revisited 

As already mentioned in Section 3.3, an iterative thresholding algorithm can be used for identifying local minimizers of the Mumford-Shah functional in one dimension [21]. This algorithm is very attractive for its exceptional simplicity and its ability to perform a separation of components after a finite number of iterations, as we have shown in Lemma 3.18. This property highlights the identification of possible discontinuities after only a finite number of iterations, in principle allowing then for refinement strategies when it comes to adaptive discretizations; see [7] for a recent approach to adaptive discretization in fracture simulation. Hence, we would like to see whether an iterative thresholding algorithm can also play a profitable role for free-discontinuity problems in higher dimension.

4.1 Formulation of an iterative thresholding algorithm 

In the inner loop of algorithm (2.14), one has to recursively solve a smooth and unconstrained convex optimization of the type

$$v^* = \arg\min_{v \in \mathcal{H}} J_{\omega,u}(v,q). \tag{4.1}$$

In principle, as $J_{\omega,u}(v,q)$ is smooth and strictly convex, one can use relatively simple gradient descent methods and this task does not present any particular difficulty. Actually, it does constitute one of the strong features of algorithm (2.14), making it very easily implementable. Nevertheless, as we pointed out in Proposition 3.16, the vector $v^*$ can be point-wise characterized by the following fixed-point equation, for $i = 1, \dots, m$,

$$v_i^* = H_p\Big( \tfrac13 \big\{ [(I - T^*T) + (I - \tfrac12 A^*A) + (1 - \omega)I] v^* + (T^*g + \tfrac12 A^*(f + q) + \omega u) \big\}_i \Big). \tag{4.2}$$

Hence, it is natural to wonder whether the corresponding fixed-point iteration

$$v_i^{n+1} = H_p\Big( \tfrac13 \big\{ [(I - T^*T) + (I - \tfrac12 A^*A) + (1 - \omega)I] v^n + (T^*g + \tfrac12 A^*(f + q) + \omega u) \big\}_i \Big) \tag{4.3}$$

generates a sequence $(v^n)_{n \in \mathbb{N}}$ which converges to $v^*$. The next theorem, which is again based on the separation of components, gives a positive answer to this question.

Theorem 4.1. Assume $\|T\| < 1$, $\frac{1}{\sqrt{2}}\|A\| < 1$, and $\omega < 1$. Moreover, let $\varepsilon_0 > 0$ as in Remark 3.5 and

$$\varepsilon < \min\Big\{ \varepsilon_0, \ \frac{(5/4)^{1/(p-1)} - 1}{(5/4)^{1/(p-1)}} \, r, \ \frac13 \frac{\gamma p r^{p-1}}{20} \Big\}. \tag{4.4}$$

Let $v^* = \arg\min_{v \in \mathcal{H}} J_{\omega,u}(v,q)$, and consider the sequence $v^n$ defined by the iteration (4.3) and its time-dependent partition of the index set $I = \{1, \dots, m\}$ into "small" components

$$I_0^n := \{i \in I : |(v^n)_i| \le \xi' - \delta\},$$

and "large" components

$$I_1^n := \{i \in I : |(v^n)_i| > \xi'\}.$$

Then for $N \in \mathbb{N}$ sufficiently large, this partition fixes during the iteration for $n \ge N$; that is, there exists $I_0$ such that for all $n \ge N$, $I_0^n = I_0$ and $I_1^n = I \setminus I_0$. Moreover for every $m \in \mathbb{N}$ one has

$$\|v^{N+m} - v^*\| \le \Big(1 - \frac{\omega}{3}\Big)^m \|v^N - v^*\|, \tag{4.5}$$

so that in particular $v^n \to v^*$ as $n$ tends to $+\infty$.

Proof. The proof follows closely similar arguments in [21, Section 4]. First of all, by 
Proposition 3.2 and Theorem 3.6, if v n satisfies (4.3), then for every n G N 

Now the same argument as in [21, Lemma 4.2] gives that (v n+l — v n ) — > as n tends to 
+00, whence, arguing exactly as in Lemma 3.18, we deduce the property of fixation of the 
indexes. 

After fixation of the index set $\mathcal{I}_0$, for every $n \ge N$ one has $v^{n+1} = U_{\mathcal{I}_0}(v^n)$, where 
$U_{\mathcal{I}_0}$ is an operator having component-wise action defined by 

[U2b(«)]i = F- 1 Q {[(/ - T*T) + (/ - I A* A) + (1 - u)I}v + (T*g + l -A*(f + q) + am)} ) 
if i G Xq , and 

= \[[(I- T*T) + (/ - l -A*A) + (1 - u)I]v + (T*g + l -A\f + q) + urn)} 

if i Eli. The function F p is here defined as in Proposition 3.2. Using the hypotheses, it 
is easy to show that - T*T) + (I - \A*A) + (1 - uj)I}\\ < 1 - | , therefore, since the 

mapping F^ 1 is nonexpansive, we get that 

Lip(U Xo )<l-|; 

in particular, $U_{\mathcal{I}_0}$ is a contraction mapping. By the Banach fixed-point theorem, we infer 
that (4.5) holds with $v^*$ the unique fixed point of $U_{\mathcal{I}_0}$. It follows that $v^*$ satisfies (4.2), 
so that the proof is concluded by using Proposition 3.16. □ 
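
The contraction argument used in the proof can be illustrated numerically. The sketch below does not implement the operator of the proof: it assumes a hypothetical two-dimensional map `U` made of a linear part with factor `L = 0.8` followed by soft-thresholding, which is used only as a stand-in for a nonexpansive component-wise map. The names `U`, `soft`, `L`, `TAU` and `B` are illustrative and do not come from the paper. The code records the errors along the iteration so that the geometric decay predicted by the Banach fixed-point theorem can be inspected.

```python
import math

L = 0.8          # assumed contraction factor of the linear part (hypothetical)
TAU = 0.1        # assumed threshold of the stand-in nonexpansive map
B = (1.0, -0.5)  # assumed constant term

def soft(t, tau=TAU):
    """Soft-thresholding: a 1-Lipschitz (nonexpansive) scalar map."""
    return math.copysign(max(abs(t) - tau, 0.0), t)

def U(v):
    """Component-wise map: nonexpansive step applied after a linear contraction."""
    return tuple(soft(L * vi + bi) for vi, bi in zip(v, B))

def dist(a, b):
    """Euclidean distance between two tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def fixed_point(v0, tol=1e-14, max_iter=10_000):
    """Plain fixed-point iteration v <- U(v) until successive iterates agree."""
    v = v0
    for _ in range(max_iter):
        w = U(v)
        if dist(w, v) <= tol:
            return w
        v = w
    return v

v_star = fixed_point((0.0, 0.0))

# Error decay from an arbitrary start: ||v^m - v*|| <= L^m ||v^0 - v*||
v = (5.0, 7.0)
e0 = dist(v, v_star)
errors = []
for m in range(1, 21):
    v = U(v)
    errors.append(dist(v, v_star))
```

Since `soft` is nonexpansive and the linear part scales distances by `L`, the composite map has Lipschitz constant at most `L`, which is the same mechanism that yields a geometric rate in (4.5).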

Remark 4.2. In view of Theorem 4.1, the algorithm (2.14) applied for $J = J_p^\varepsilon$ depends 
on $\varepsilon$ through the choice of $\omega = \omega(\varepsilon)$ needed to make $J^{\varepsilon}_{p,\omega,u}$ convex. Actually, $\omega = \omega(\varepsilon)$ grows 
as $\varepsilon \to 0$. In view of the necessary rescaling, this determines a deterioration of the 
convergence quality of the algorithm. In particular, the iteration $N$ in (4.5), from which 
the algorithm fixes the index sets $\mathcal{I}_0$ and $\mathcal{I}_1$ and starts to converge exponentially fast, 
might get delayed. Hence, while $\varepsilon > 0$ cannot be chosen too large according to (4.4), it 
should also not be chosen too small. A more detailed analysis of the dependence on $\varepsilon$ 
of the convergence properties of the algorithm will be explored in a subsequent numerical 
analysis work. 
5 Appendix 

5.1 Proof of Theorem 3.6 

In order to achieve the proof, we need two preliminary lemmas. Let us start with a simple 
technical result. 

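Lemma 5.1. Let $f : \mathbb{R} \to \mathbb{R}$ be of class $C^1$ and $g : \mathbb{R} \to \mathbb{R}$ be a Lipschitz function. Let 
$0 < \varepsilon < r$ and assume 

1) $g$ has a unique minimizer $\bar{x}$ in $\mathbb{R}$; 

2) the minimizer $\bar{x} \notin [-r-\varepsilon, -r+\varepsilon] \cup [r-\varepsilon, r+\varepsilon]$; 

3) $f = g$ in $\mathbb{R} \setminus \{(-r-\varepsilon, -r+\varepsilon) \cup (r-\varepsilon, r+\varepsilon)\}$; 

4) $f'$ is decreasing in $(-r-\varepsilon, -r+\varepsilon) \cup (r-\varepsilon, r+\varepsilon)$. 
Then $\bar{x}$ is also the unique global minimizer for $f$. 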

Proof. If $x \in \mathbb{R} \setminus \{(-r-\varepsilon, -r+\varepsilon) \cup (r-\varepsilon, r+\varepsilon)\}$ then we have 

$$f(x) = g(x) \ge g(\bar{x}) = f(\bar{x}),$$ 

with equality only if $x = \bar{x}$. If $x \in (r-\varepsilon, r+\varepsilon)$ and $f'(x) = 0$, by 4) we would immediately 
obtain by the Fundamental Theorem of Calculus that 

$$f(x) \ge \min\{f(r-\varepsilon), f(r+\varepsilon)\}.$$ 

Actually the same holds if $f' \neq 0$ in $(r-\varepsilon, r+\varepsilon)$, as follows directly from the Weierstrass 
Theorem. Hence, it follows that 

$$f(x) \ge \min\{f(r-\varepsilon), f(r+\varepsilon)\} = \min\{g(r-\varepsilon), g(r+\varepsilon)\} > g(\bar{x}) = f(\bar{x}).$$ 

The case $x \in (-r-\varepsilon, -r+\varepsilon)$ is analogous. □ 

In the following Lemma we address the concavity of polynomials $\pi$ of third degree as 
in Lemma 3.4. 

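Lemma 5.2. Referring to the notations of Lemma 3.4, let $s_1 = (r - \varepsilon)$, $s_2 = (r + \varepsilon)$, 
$\gamma_1 = p(r - \varepsilon)^{p-1}$, $\gamma_2 = (r - \varepsilon)^p$, and $\gamma_3 = r^p$. Let us set $\delta = (5/4)^{1/(p-1)} - 1$, and suppose 

$\varepsilon < \frac{\delta}{1+\delta} r$. Then for all $t \in (r - \varepsilon, r + \varepsilon)$, we have: 

$$\pi''(t) \le -\frac{p}{10\varepsilon}\, r^{p-1}.$$ 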

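Proof. In this case we have 

$$B = \frac{p(r-\varepsilon)^{p-1}}{2\varepsilon} + \frac{3}{4\varepsilon^2}\left[(r-\varepsilon)^p - r^p\right], \qquad (5.1)$$ 

$$A = \frac{p(r-\varepsilon)^{p-1}}{12\varepsilon^2} + \frac{B}{3\varepsilon}. \qquad (5.2)$$ 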
Since $(r - \varepsilon)^p - r^p \le -\varepsilon p (r - \varepsilon)^{p-1}$ by convexity, we deduce from (5.1) that 

$$B \le -\frac{p(r-\varepsilon)^{p-1}}{4\varepsilon},$$ 



and therefore, from (5.2) we have also $A \le 0$. But then for all $t \in (r - \varepsilon, r + \varepsilon)$, we deduce 
from these negativity relationships and again (5.2) that 

$$\pi''(t) = 6A(t - (r + \varepsilon)) + 2B \le -12\varepsilon A + 2B \le -\frac{p(r-\varepsilon)^{p-1}}{\varepsilon} - 2B. \qquad (5.3)$$ 
By the mean value theorem there exists $\xi \in (r - \varepsilon, r)$ such that 

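$$r^p - (r - \varepsilon)^p = \varepsilon p\, \xi^{p-1} \le \varepsilon p\, r^{p-1},$$ 

and therefore, from (5.1) 

$$B \ge \frac{p(r-\varepsilon)^{p-1}}{2\varepsilon} - \frac{3}{4\varepsilon}\, p\, r^{p-1}.$$ 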

By substituting this latter estimate in (5.3) we obtain 

$$\pi''(t) \le -\frac{2p(r-\varepsilon)^{p-1}}{\varepsilon} + \frac{3p}{2\varepsilon}\, r^{p-1}. \qquad (5.4)$$ 

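Now, $\varepsilon < \frac{\delta}{1+\delta} r$ if and only if $0 < \delta(r - \varepsilon) - \varepsilon$ if and only if $r < (1 + \delta)(r - \varepsilon)$ or, equivalently, 
$r^{p-1} < (1 + \delta)^{p-1}(r - \varepsilon)^{p-1}$. For the chosen value of $\delta$, we obtain $r^{p-1} < \frac{5}{4}(r - \varepsilon)^{p-1}$ or 
$-(r - \varepsilon)^{p-1} < -\frac{4}{5} r^{p-1}$. Substituting the latter estimate in (5.4), we eventually have 

$$\pi''(t) \le -\frac{8p}{5\varepsilon}\, r^{p-1} + \frac{3p}{2\varepsilon}\, r^{p-1} = -\frac{p}{10\varepsilon}\, r^{p-1}.$$ 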

□ 
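
The estimate of Lemma 5.2 can be checked numerically. The sketch below is only a sanity check under stated assumptions, not part of the proof: it assumes that the cubic has the form $\pi(t) = A(t - s_2)^3 + B(t - s_2)^2 + r^p$ and that it matches the value and the slope of $t^p$ at $r - \varepsilon$, with value $r^p$ and zero slope at $r + \varepsilon$ (this is how the data $\gamma_1, \gamma_2, \gamma_3$ are read here). It also assumes $p > 1$. The function names `cubic_coeffs`, `second_derivative` and `check` are hypothetical.

```python
def cubic_coeffs(p, r, eps):
    """Coefficients A, B of pi(t) = A (t - s2)^3 + B (t - s2)^2 + r^p,
    with s2 = r + eps, under the assumed Hermite conditions at r - eps."""
    g1 = p * (r - eps) ** (p - 1)  # slope of t^p at r - eps
    B = g1 / (2.0 * eps) + 3.0 * ((r - eps) ** p - r ** p) / (4.0 * eps ** 2)
    A = g1 / (12.0 * eps ** 2) + B / (3.0 * eps)
    return A, B

def second_derivative(t, p, r, eps):
    """pi''(t) = 6 A (t - (r + eps)) + 2 B."""
    A, B = cubic_coeffs(p, r, eps)
    return 6.0 * A * (t - (r + eps)) + 2.0 * B

def check(p, r, eps, samples=200):
    """Return (largest sampled pi'' on (r - eps, r + eps), claimed upper bound)."""
    delta = (5.0 / 4.0) ** (1.0 / (p - 1.0)) - 1.0
    assert eps < delta / (1.0 + delta) * r, "eps violates the hypothesis"
    bound = -p * r ** (p - 1) / (10.0 * eps)
    ts = [r - eps + 2.0 * eps * (k + 0.5) / samples for k in range(samples)]
    worst = max(second_derivative(t, p, r, eps) for t in ts)
    return worst, bound

# Example parameters (arbitrary choices satisfying the hypothesis on eps)
worst, bound = check(1.5, 1.0, 0.1)
```

For these sample parameters the largest sampled value of $\pi''$ stays below the bound $-\frac{p}{10\varepsilon} r^{p-1}$, consistent with the statement of the lemma.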

We can now give the proof of Theorem 3.6. We intend to apply Lemma 5.1. From 
Remark 3.5, by construction and by the assumption $\varepsilon < \varepsilon_0$, both $f = f_{\xi,r}$ and $g = g_{\xi,r}$ verify 
the assumptions 1), 2), and 3) of Lemma 5.1. In order to show that assumption 4) also 
holds, it is sufficient to show that $(f_{\xi,r})''(t) < 0$ for all $t \in (-r-\varepsilon, -r+\varepsilon) \cup (r-\varepsilon, r+\varepsilon)$. 
As $\frac{d^2}{dt^2}(t - \xi)^2 = 2$ and by parity of $W_r^{p,\varepsilon}$, it is sufficient to check the negativity of $(f_{\xi,r})''(t)$ 
for all $t \in (r - \varepsilon, r + \varepsilon)$. In this interval, by Lemma 5.2 and by construction of $f_{\xi,r}$ we have 

$$(f_{\xi,r})''(t) = 2 + \gamma\, \pi''(t) \le 2 - \frac{\gamma p}{10\varepsilon}\, r^{p-1} < 0,$$ 

where we used the assumption (3.24). 



5.2 Proof of Theorem 3.7 

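This proof uses an approach similar to that of [21, Theorem 2.3]. Let us first consider a partition 
$\{\mathcal{U}_{\mathcal{I}_0^j}\}_{j=1}^{2^m}$ of $\mathcal{H}$ indexed by all subsets $\mathcal{I}_0^j \subset \mathcal{I}$, as follows: 

$$\mathcal{U}_{\mathcal{I}_0^j} = \{v \in \mathcal{H} : |v_i| \le r + \varepsilon,\ i \in \mathcal{I}_0^j,\ |v_i| > r + \varepsilon,\ i \in \mathcal{I} \setminus \mathcal{I}_0^j\}.$$ 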
The minimization of $J_p^\varepsilon$ over $\mathcal{F}(f) \cap \mathcal{U}_{\mathcal{I}_0^j}$ can be reformulated as 

f Minimize J*{v) = \\Tv - g\\ 2 + 7 £™i Q^K), Subject to v € 7(/)n^, g 
1 q = if % G X \ Xq and q = 1 if z G Xq , 

where 

{t p , t<r-e, 
vrp(t), r-e<t<r + e, 
|t-e|P t>r + e. 

If we prove that the minimization (5.5) always has a solution $v(\mathcal{I}_0^j)$ for all $j = 1, \dots, 2^m$, 
and that such a minimizer belongs to a compact set $M^j$, independent of $\varepsilon > 0$, then 



f = ar 



is actually a solution for (3.26) and it belongs to the compact set $M = \bigcup_{j=1}^{2^m} M^j$, 
independent of $\varepsilon > 0$. Hence, it is now sufficient to address (5.5). For that, we first show the 
following technical observation: 

If $x, v \in \mathcal{H}$ are fixed and $J_p^\varepsilon$ is bounded above and below on the ray $R_{x,v} = 
\{x + tv,\ t \ge 0\}$, then $J_p^\varepsilon$ is actually constant on $R_{x,v}$. In fact, let us consider the 
function $\mu(t) = J_p^\varepsilon(x + tv)$. By the boundedness of $J_p^\varepsilon(x + tv)$, without loss of 
generality, we can assume that $0 \le \mu(t) \le 1$. Hence there exists a sequence $(t_n)_n \subset \mathbb{R}_+$ 
of points $t_n \to +\infty$ for $n \to \infty$ such that $\mu(t_n) \to \eta \in [0,1]$ for $n \to \infty$. Moreover, 
by definition of $J_p^\varepsilon$, for $t > 0$ sufficiently large we have actually the general expression 
$\mu(t) = P(t) + \gamma \sum_{i=1}^m c_i |x_i - \varepsilon + t v_i|^p$, where $P$ is a polynomial of degree at most 2. 
Assume now, for instance, that $1 < p < 2$. As $\lim_n \mu(t_n)/t_n^2 = 0$, we deduce that all the 
coefficients in $P$ of second degree are actually vanishing. In turn, $0 = \lim_n \mu(t_n)/t_n^p$ then 
has the implication that for each $i$ one of the coefficients $c_i$ or $v_i$ must vanish as well. 
Proceeding in the same manner, we conclude that all linear coefficients in $\mu(t)$ also vanish, 
leaving only the possibility that $\mu(t)$ is a constant function. A similar approach can be 
used to prove the observation also for $p > 2$. 



or 



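Notice now that $J_p^\varepsilon$ converges uniformly to $J_p$ on $\mathcal{U}_{\mathcal{I}_0^j}$ for $\varepsilon \to 0$, as defined in (3.6), 

$$|J_p^\varepsilon(v) - J_p(v)| \le \Gamma(\varepsilon), \quad \text{for all } v \in \mathcal{U}_{\mathcal{I}_0^j}, \qquad (5.6)$$ 

for a continuous function $\Gamma(\varepsilon) = o(\varepsilon)$, $\varepsilon \to 0$. By Remark 3.1, for $X = \mathcal{F}(f) \cap \mathcal{U}_{\mathcal{I}_0^j}$ and 
any $v^0 \in X$, there exists a linear subspace $V \subset \mathcal{H}$, such that the orthogonal projection 
$X^\perp$ of $X$ onto $V^\perp$ has the properties 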

• X = {x = x 1 - © tv : x 1 - G X 1 , v G V, t G R+} , 

• M j c = X 1 - n G U : J p (v) < C} , for C > J^(v°) + r(e) is compact, and 

• Jp{it) is constant along rays £ t = x- 1 © tt> , where x- 1 G M^, t> G V, and i G M. + . 
By the uniform estimate (5.6) and the last property, we deduce that Jp{it) is bounded 
from above and below by J p {x^-) ± T(e) on rays £j = x 1 - © tv , where x 1 - G Mp, v G V, 
and i G M + . Hence, we conclude that Jp{it) is also constant for t > 0. From (5.6), the 
set 

x 1 - n {v e u : j;{v) < j;(v )}, 

is included in M 3 C , and 

inf J» = mf.J*{v). 
x o ° 

By compactness of M- 7 = and continuity of Jp we conclude the existence of minimiz- 
ers in M J . As pointed out above, this further implies the existence of minimal solutions 
in M = U^^M- 7 of the original problem (3.26). Notice further that, by continuity of 

Sp{v°) +T(e) with respect to e, the sets M- 7 = M° c actually do not depend on < e < Eq 
as soon as C > max 0<e<eo Jp{v°) + T(e) is large enough. 